{ "cells": [ { "cell_type": "markdown", "id": "0", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "# Demo: Custom observation types\n", "This demo shows how to use observation types." ] }, { "cell_type": "code", "execution_count": 1, "id": "1", "metadata": { "execution": { "iopub.execute_input": "2025-05-14T11:45:02.711455Z", "iopub.status.busy": "2025-05-14T11:45:02.709905Z", "iopub.status.idle": "2025-05-14T11:45:06.529754Z", "shell.execute_reply": "2025-05-14T11:45:06.528515Z" } }, "outputs": [], "source": [ "import metobs_toolkit\n", "\n", "#Initialize an empty Dataset\n", "your_dataset = metobs_toolkit.Dataset()" ] }, { "cell_type": "markdown", "id": "2", "metadata": {}, "source": [ "## Default observation types\n", "\n", "An observation record must always be linked to an *observation type*, which is specified by the ``Obstype`` class. \n", "An Obstype represents one observation type (e.g. temperature), and it handles unit conversions and string representations of an observation type. 
\n", "\n", "By default, a set of standard observation types is stored in a Dataset:" ] }, { "cell_type": "code", "execution_count": 2, "id": "3", "metadata": { "execution": { "iopub.execute_input": "2025-05-14T11:45:06.534602Z", "iopub.status.busy": "2025-05-14T11:45:06.533577Z", "iopub.status.idle": "2025-05-14T11:45:06.550277Z", "shell.execute_reply": "2025-05-14T11:45:06.548860Z" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "{'temp': Obstype instance of temp,\n", " 'humidity': Obstype instance of humidity,\n", " 'radiation_temp': Obstype instance of radiation_temp,\n", " 'pressure': Obstype instance of pressure,\n", " 'pressure_at_sea_level': Obstype instance of pressure_at_sea_level,\n", " 'precip': Obstype instance of precip,\n", " 'precip_sum': Obstype instance of precip_sum,\n", " 'wind_speed': Obstype instance of wind_speed,\n", " 'wind_gust': Obstype instance of wind_gust,\n", " 'wind_direction': Obstype instance of wind_direction}" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "your_dataset.obstypes" ] }, { "cell_type": "markdown", "id": "4", "metadata": {}, "source": [ "## The Obstype class\n", "\n", "As an example, we take a look at the temperature obstype, which is an instance of the ``Obstype`` class." 
] }, { "cell_type": "code", "execution_count": 3, "id": "5", "metadata": { "execution": { "iopub.execute_input": "2025-05-14T11:45:06.555889Z", "iopub.status.busy": "2025-05-14T11:45:06.555048Z", "iopub.status.idle": "2025-05-14T11:45:06.570172Z", "shell.execute_reply": "2025-05-14T11:45:06.568936Z" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", " General info of Obstype \n", "================================================================================\n", "\n", "temp observation with:\n", " -standard unit: degree_Celsius\n", " -description: 2m - temperature\n", "\n" ] } ], "source": [ "temperature_obstype = your_dataset.obstypes['temp']\n", "temperature_obstype.get_info()" ] }, { "cell_type": "markdown", "id": "6", "metadata": {}, "source": [ "The most important attribute of an ``Obstype`` is its standard unit. This is the unit in which values are converted and stored. For temperature, this is by default set to degrees Celsius." ] }, { "cell_type": "code", "execution_count": 4, "id": "7", "metadata": { "execution": { "iopub.execute_input": "2025-05-14T11:45:06.575556Z", "iopub.status.busy": "2025-05-14T11:45:06.574387Z", "iopub.status.idle": "2025-05-14T11:45:06.585068Z", "shell.execute_reply": "2025-05-14T11:45:06.583806Z" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "'degree_Celsius'" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temperature_obstype.std_unit" ] }, { "cell_type": "markdown", "id": "8", "metadata": {}, "source": [ "## Creating a new observation type\n", "\n", "In practice, a new observation type is usually defined before importing the raw dataset. 
When creating a template file with the ``metobs_toolkit.build_template_prompt()`` function, the prompt prints, at the end, a snippet of code that creates the new observation type. \n", "\n", "As an illustration, we will define a new observation type for gas concentrations." ] }, { "cell_type": "code", "execution_count": 5, "id": "9", "metadata": { "execution": { "iopub.execute_input": "2025-05-14T11:45:06.591642Z", "iopub.status.busy": "2025-05-14T11:45:06.590419Z", "iopub.status.idle": "2025-05-14T11:45:06.605509Z", "shell.execute_reply": "2025-05-14T11:45:06.603714Z" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "gas_concentration = metobs_toolkit.Obstype(\n", " obsname='gas_ratio',\n", " std_unit='ppm', #see all available units: https://github.com/hgrecco/pint/blob/master/pint/default_en.txt\n", " description='The gas concentration measured at 2m',\n", " )" ] }, { "cell_type": "markdown", "id": "10", "metadata": {}, "source": [ "If your raw data contains gas concentrations, add the new observation type to the Dataset before importing the data." ] }, { "cell_type": "code", "execution_count": 6, "id": "11", "metadata": { "execution": { "iopub.execute_input": "2025-05-14T11:45:06.612401Z", "iopub.status.busy": "2025-05-14T11:45:06.611669Z", "iopub.status.idle": "2025-05-14T11:45:06.623303Z", "shell.execute_reply": "2025-05-14T11:45:06.620942Z" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "your_dataset = metobs_toolkit.Dataset() #Initialize an empty dataset\n", "\n", "your_dataset.add_new_observationtype(gas_concentration) #add the new observation type\n", "\n", "#your_dataset.import_data_from_file( ... 
)\n" ] }, { "cell_type": "markdown", "id": "12", "metadata": {}, "source": [ "## Obstypes for Modeldata\n", "### ModelObstype\n", "An extension to the `Obstype` class is the `ModelObstype` class, which is used for interacting with a Google Earth Engine (GEE) dataset. In addition to what a regular `Obstype` holds, a `ModelObstype` stores which band (of the GEE dataset) represents the observation, and it handles the unit conversion. \n", "\n", "*Note:* All methods that work on `Obstype` also work on `ModelObstype`.\n", "\n", "Since a ``ModelObstype`` is specific to one GEE dataset, the known ``ModelObstype``s are stored per ``GeeDatasetManager`` (= the class in the MetObs-toolkit that defines a Google Earth Engine dataset), namely in each `GeeDynamicDatasetManager`. As a default, there is an ERA5-land `GeeDynamicDatasetManager` stored in all Datasets. \n", "\n", "The following GEE datasets are defined by default:" ] }, { "cell_type": "code", "execution_count": 7, "id": "13", "metadata": { "execution": { "iopub.execute_input": "2025-05-14T11:45:06.631739Z", "iopub.status.busy": "2025-05-14T11:45:06.630753Z", "iopub.status.idle": "2025-05-14T11:45:06.642880Z", "shell.execute_reply": "2025-05-14T11:45:06.641388Z" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "{'LCZ': GeeStaticDatasetManager representation of LCZ ,\n", " 'altitude': GeeStaticDatasetManager representation of altitude ,\n", " 'worldcover': GeeStaticDatasetManager representation of worldcover ,\n", " 'ERA5-land': GeeDynamicDatasetManager representation of ERA5-land }" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "metobs_toolkit.default_GEE_datasets" ] }, { "cell_type": "markdown", "id": "14", "metadata": {}, "source": [ "As an example, we take a look at ``ERA5-land``, 
which is a ``GeeDynamicDatasetManager`` representing the ERA5-Land dataset on GEE.\n", "\n", "By using the ``get_info()`` method (or by accessing the ``.modelobstypes`` attribute), we can see the known modelobstypes." ] }, { "cell_type": "code", "execution_count": 8, "id": "15", "metadata": { "execution": { "iopub.execute_input": "2025-05-14T11:45:06.647601Z", "iopub.status.busy": "2025-05-14T11:45:06.647050Z", "iopub.status.idle": "2025-05-14T11:45:06.661639Z", "shell.execute_reply": "2025-05-14T11:45:06.658271Z" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", " General info of GEEDynamicDataset \n", "================================================================================\n", "\n", "\n", "--- GEE Dataset details ---\n", "\n", " -name: ERA5-land\n", " -location: ECMWF/ERA5_LAND/HOURLY\n", " -value_type: numeric\n", " -scale: 2500\n", " -is_static: False\n", " -is_image: False\n", " -is_mosaic: False\n", " -credentials: \n", " -time res: 1h\n", "\n", "--- Known Modelobstypes ---\n", "\n", " -temp : ModelObstype instance of temp\n", " -conversion: kelvin --> degree_Celsius\n", " -pressure : ModelObstype instance of pressure\n", " -conversion: 1.000 pascal --> hectopascal\n", " -wind : ModelObstype_Vectorfield instance of wind\n", " -vectorfield that will be converted to: \n", " -wind_speed\n", " -wind_direction\n", " -conversion: meter / second --> meter / second\n", "\n" ] } ], "source": [ "era5_model = metobs_toolkit.default_GEE_datasets['ERA5-land']\n", "era5_model.get_info()" ] }, { "cell_type": "code", "execution_count": 9, "id": "16", "metadata": { "execution": { "iopub.execute_input": "2025-05-14T11:45:06.665787Z", "iopub.status.busy": "2025-05-14T11:45:06.665239Z", "iopub.status.idle": "2025-05-14T11:45:06.676467Z", 
"shell.execute_reply": "2025-05-14T11:45:06.675195Z" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "{'temp': ModelObstype instance of temp,\n", " 'pressure': ModelObstype instance of pressure,\n", " 'wind': ModelObstype_Vectorfield instance of wind}" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#or by accessing the attribute\n", "era5_model.modelobstypes\n" ] }, { "cell_type": "markdown", "id": "17", "metadata": {}, "source": [ "As an example, we will create a new ModelObstype that represents the accumulated precipitation as is present in the ERA5_land GEE dataset." ] }, { "cell_type": "code", "execution_count": 10, "id": "18", "metadata": { "execution": { "iopub.execute_input": "2025-05-14T11:45:06.681264Z", "iopub.status.busy": "2025-05-14T11:45:06.680390Z", "iopub.status.idle": "2025-05-14T11:45:06.694687Z", "shell.execute_reply": "2025-05-14T11:45:06.693430Z" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "{'temp': ModelObstype instance of temp,\n", " 'pressure': ModelObstype instance of pressure,\n", " 'wind': ModelObstype_Vectorfield instance of wind,\n", " 'cumulated_precip': ModelObstype instance of cumulated_precip}" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "from datetime import datetime\n", "#Create a new observation type\n", "precipitation = metobs_toolkit.Obstype(obsname='cumulated_precip',\n", " std_unit='mm',\n", " description='Cumulated total precipitation since midnight per squared meter')\n", "\n", "#Create the ModelObstype\n", "precip_in_era5 = metobs_toolkit.ModelObstype(\n", " obstype=precipitation,\n", " model_band='total_precipitation', #look this up: 
https://developers.google.com/earth-engine/datasets/catalog/ECMWF_ERA5_LAND_HOURLY#bands \n", " model_unit='m',\n", " )\n", "# Add it to the ERA5 model\n", "era5_model.add_modelobstype(precip_in_era5)\n", "\n", "era5_model.modelobstypes" ] }, { "cell_type": "code", "execution_count": 11, "id": "19", "metadata": { "execution": { "iopub.execute_input": "2025-05-14T11:45:06.697876Z", "iopub.status.busy": "2025-05-14T11:45:06.697563Z", "iopub.status.idle": "2025-05-14T11:45:06.703636Z", "shell.execute_reply": "2025-05-14T11:45:06.702860Z" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", " General info of ModelObstype \n", "================================================================================\n", "\n", "\n", "--- Obstype info ---\n", "\n", "cumulated_precip observation with:\n", " -standard unit: millimeter\n", " -description: Cumulated total precipitation since midnight per squared mete...\n", "\n", "--- Model related info ---\n", "\n", " -corresponding bandname: total_precipitation\n", " -original modeldata unit: 1.000 meter\n", "\n" ] } ], "source": [ "precip_in_era5.get_info()" ] }, { "cell_type": "markdown", "id": "20", "metadata": {}, "source": [ "Now you can extract cumulated precipitation data directly from GEE. We refer to the [GEE Notebook](gee_example.ipynb) for an example on extracting ERA5 data." ] }, { "cell_type": "markdown", "id": "21", "metadata": {}, "source": [ "### ModelObstype_Vectorfield\n", "At a specific height, the wind can be seen (by approximation) as a 2D vector field. The vector components are often stored in different bands/variables in a model.\n", "\n", "For example, if you want the 10m windspeed from ERA5 you cannot find a band for the windspeed. 
There are bands for the\n", "u and v components of the wind. \n", "\n", "The `ModelObstype_Vectorfield` class represents a modelobstype for which no band exists, but which can be constructed from (orthogonal) components. The vector amplitude and direction are computed, and the corresponding `ModelObstype`s are created.\n", "\n", "By default, the *wind* is added as a `ModelObstype_Vectorfield` for the ERA5-land `GeeDynamicDatasetManager`." ] }, { "cell_type": "code", "execution_count": 12, "id": "22", "metadata": { "execution": { "iopub.execute_input": "2025-05-14T11:45:06.707762Z", "iopub.status.busy": "2025-05-14T11:45:06.707228Z", "iopub.status.idle": "2025-05-14T11:45:06.726768Z", "shell.execute_reply": "2025-05-14T11:45:06.720729Z" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "{'temp': ModelObstype instance of temp,\n", " 'pressure': ModelObstype instance of pressure,\n", " 'wind': ModelObstype_Vectorfield instance of wind,\n", " 'cumulated_precip': ModelObstype instance of cumulated_precip}" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "era5_model.modelobstypes" ] }, { "cell_type": "code", "execution_count": 13, "id": "23", "metadata": { "execution": { "iopub.execute_input": "2025-05-14T11:45:06.736593Z", "iopub.status.busy": "2025-05-14T11:45:06.735359Z", "iopub.status.idle": "2025-05-14T11:45:06.752625Z", "shell.execute_reply": "2025-05-14T11:45:06.750901Z" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "================================================================================\n", " General info of ModelObstype_Vectorfield \n", "================================================================================\n", "\n", "\n", "--- Obstype 
info ---\n", "\n", "wind observation with:\n", " -standard unit: meter / second\n", " -description: 2D-vector combined 10m windspeed. Care should be taken when c...\n", "\n", "--- Model related info ---\n", "\n", " -U-component bandname: u_component_of_wind_10m\n", " -in meter / second\n", " -V-component bandname: v_component_of_wind_10m\n", " -in meter / second\n", "\n" ] } ], "source": [ "era5_wind = era5_model.modelobstypes['wind']\n", "era5_wind.get_info()" ] }, { "cell_type": "markdown", "id": "24", "metadata": {}, "source": [ "So we can see that *wind* corresponds to two bands (the u and v components)." ] }, { "cell_type": "markdown", "id": "25", "metadata": {}, "source": [ "When extracting the wind data from ERA5 (on GEE), the toolkit will:\n", " 1. Download the u and v wind components for your period and locations.\n", " 2. Convert each component to its standard units (m/s for the wind components).\n", " 3. Compute the amplitude and the direction (in degrees from North, clockwise).\n", " 4. Add a `ModelObstype` for the amplitude and one for the direction.\n", "\n", " For an example, see the [GEE Notebook](gee_example.ipynb)." 
] } ], "metadata": { "kernelspec": { "display_name": "metobs_dev", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.10" }, "widgets": { "application/vnd.jupyter.widget-state+json": { "state": { "6101f8135943461abd0a87e4dfd6dec9": { "model_module": "@jupyter-widgets/base", "model_module_version": "2.0.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "2.0.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "2.0.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border_bottom": null, "border_left": null, "border_right": null, "border_top": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "padding": null, "right": null, "top": null, "visibility": null, "width": "500px" } } }, "version_major": 2, "version_minor": 0 } } }, "nbformat": 4, "nbformat_minor": 5 }