{
"cells": [
{
"cell_type": "markdown",
"id": "0",
"metadata": {
"editable": true,
"slideshow": {
"slide_type": ""
},
"tags": []
},
"source": [
"# Demo: Custom observation types\n",
"In this demo, you can find a demonstration on how to use Observation types."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "1",
"metadata": {
"execution": {
"iopub.execute_input": "2025-05-14T11:45:02.711455Z",
"iopub.status.busy": "2025-05-14T11:45:02.709905Z",
"iopub.status.idle": "2025-05-14T11:45:06.529754Z",
"shell.execute_reply": "2025-05-14T11:45:06.528515Z"
}
},
"outputs": [],
"source": [
"import metobs_toolkit\n",
"\n",
"#Initialize an empty Dataset\n",
"your_dataset = metobs_toolkit.Dataset()"
]
},
{
"cell_type": "markdown",
"id": "2",
"metadata": {},
"source": [
"## Default observation types\n",
"\n",
"An observation record must always be linked to an *observation type* which is specified by the ``Obstype`` class. \n",
"An Obstype represents one observation type (i.g. temperature), and it handles unit conversions and string representations of an observation type. \n",
"\n",
"By default, a set of standard observationtypes are stored in a Dataset:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "3",
"metadata": {
"execution": {
"iopub.execute_input": "2025-05-14T11:45:06.534602Z",
"iopub.status.busy": "2025-05-14T11:45:06.533577Z",
"iopub.status.idle": "2025-05-14T11:45:06.550277Z",
"shell.execute_reply": "2025-05-14T11:45:06.548860Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"{'temp': Obstype instance of temp,\n",
" 'humidity': Obstype instance of humidity,\n",
" 'radiation_temp': Obstype instance of radiation_temp,\n",
" 'pressure': Obstype instance of pressure,\n",
" 'pressure_at_sea_level': Obstype instance of pressure_at_sea_level,\n",
" 'precip': Obstype instance of precip,\n",
" 'precip_sum': Obstype instance of precip_sum,\n",
" 'wind_speed': Obstype instance of wind_speed,\n",
" 'wind_gust': Obstype instance of wind_gust,\n",
" 'wind_direction': Obstype instance of wind_direction}"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"your_dataset.obstypes"
]
},
{
"cell_type": "markdown",
"id": "4",
"metadata": {},
"source": [
"## The Obstype class\n",
"\n",
"As an example we take a look at the temperature obstype, which is an instance of the``Obstype`` class."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "5",
"metadata": {
"execution": {
"iopub.execute_input": "2025-05-14T11:45:06.555889Z",
"iopub.status.busy": "2025-05-14T11:45:06.555048Z",
"iopub.status.idle": "2025-05-14T11:45:06.570172Z",
"shell.execute_reply": "2025-05-14T11:45:06.568936Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"================================================================================\n",
" General info of Obstype \n",
"================================================================================\n",
"\n",
"temp observation with:\n",
" -standard unit: degree_Celsius\n",
" -description: 2m - temperature\n",
"\n"
]
}
],
"source": [
"temperature_obstype = your_dataset.obstypes['temp']\n",
"temperature_obstype.get_info()"
]
},
{
"cell_type": "markdown",
"id": "6",
"metadata": {},
"source": [
"The most important attribute of an ``Obstype`` is it's standard unit. That is the unit to transform and store values in. For temperature this is by default set to degrees Celsius."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "7",
"metadata": {
"execution": {
"iopub.execute_input": "2025-05-14T11:45:06.575556Z",
"iopub.status.busy": "2025-05-14T11:45:06.574387Z",
"iopub.status.idle": "2025-05-14T11:45:06.585068Z",
"shell.execute_reply": "2025-05-14T11:45:06.583806Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"'degree_Celsius'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"temperature_obstype.std_unit"
]
},
{
"cell_type": "markdown",
"id": "8",
"metadata": {},
"source": [
"# Creating a new observationtype\n",
"\n",
"In practice it is most common that a new observation type is defined before importing the raw dataset. When creating a template file with the ``metobs_toolkit.build_template_prompt()`` function, the prompt will in the end print out snippet of code that will create the new observation type. \n",
"\n",
"As an illustration, a we will define a new observationtype for gas-concentrations."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "9",
"metadata": {
"execution": {
"iopub.execute_input": "2025-05-14T11:45:06.591642Z",
"iopub.status.busy": "2025-05-14T11:45:06.590419Z",
"iopub.status.idle": "2025-05-14T11:45:06.605509Z",
"shell.execute_reply": "2025-05-14T11:45:06.603714Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"gas_concentration = metobs_toolkit.Obstype(\n",
" obsname='gas_ratio',\n",
" std_unit='ppm', #see all available units: https://github.com/hgrecco/pint/blob/master/pint/default_en.txt\n",
" description='The gas concentration measured a 2m',\n",
" )"
]
},
{
"cell_type": "markdown",
"id": "10",
"metadata": {},
"source": [
"If you have raw data with concentrations you add them before importing the data."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "11",
"metadata": {
"execution": {
"iopub.execute_input": "2025-05-14T11:45:06.612401Z",
"iopub.status.busy": "2025-05-14T11:45:06.611669Z",
"iopub.status.idle": "2025-05-14T11:45:06.623303Z",
"shell.execute_reply": "2025-05-14T11:45:06.620942Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"your_dataset = metobs_toolkit.Dataset() #Initiate an empty dataset\n",
"\n",
"your_dataset.add_new_observationtype(gas_concentration) #add the new observation\n",
"\n",
"#your_dataset.import_data_from_file( ... )\n"
]
},
{
"cell_type": "markdown",
"id": "12",
"metadata": {},
"source": [
"## Obstypes for Modeldata\n",
"### ModelObstype\n",
"An extension to the `Obstype` class is the `ModelObstype` class which is used for interacting with GEE dataset. In addition to a regular `Obstype` a `ModelObstype` contains the info which band (of the GEE dataset) represents the observation, and handles the unit conversion. \n",
"\n",
"*Note:* All methods that work on `Obstype` do also work on `ModelObstype`.\n",
"\n",
"Since a ``ModelObstype`` is specific to a specific GEE dataset, the ``ModelObstype``s are stored per ``GeeDatasetManager`` (= the class in the MetObs-toolkit that defines a Google Earth Engine dataset). \n",
"\n",
"The following GEE dataset are define by default:\n",
"\n",
"A `ModelObstype` is specific to one GEE dataset. Therefore the known modelobstypes are stored in each `GeeDynamicDatasetManager`. As a default, there is an ERA5-land `GeeDynamicDatasetManager` stored in all Datasets."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "13",
"metadata": {
"execution": {
"iopub.execute_input": "2025-05-14T11:45:06.631739Z",
"iopub.status.busy": "2025-05-14T11:45:06.630753Z",
"iopub.status.idle": "2025-05-14T11:45:06.642880Z",
"shell.execute_reply": "2025-05-14T11:45:06.641388Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"{'LCZ': GeeStaticDatasetManager representation of LCZ ,\n",
" 'altitude': GeeStaticDatasetManager representation of altitude ,\n",
" 'worldcover': GeeStaticDatasetManager representation of worldcover ,\n",
" 'ERA5-land': GeeDynamicDatasetManager representation of ERA5-land }"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"metobs_toolkit.default_GEE_datasets"
]
},
{
"cell_type": "markdown",
"id": "14",
"metadata": {},
"source": [
"As an example we take a look in the ´ERA5-land´, which is a ``GeeDynamicDatasetManager`` representing the ERA5 dataset on GEE.\n",
"\n",
"By using the ``get_info()`` (or by accessing the ``.modelobstypes`` attribute) we can see the present modelobstypes"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "15",
"metadata": {
"execution": {
"iopub.execute_input": "2025-05-14T11:45:06.647601Z",
"iopub.status.busy": "2025-05-14T11:45:06.647050Z",
"iopub.status.idle": "2025-05-14T11:45:06.661639Z",
"shell.execute_reply": "2025-05-14T11:45:06.658271Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"================================================================================\n",
" General info of GEEDynamicDataset \n",
"================================================================================\n",
"\n",
"\n",
"--- GEE Dataset details ---\n",
"\n",
" -name: ERA5-land\n",
" -location: ECMWF/ERA5_LAND/HOURLY\n",
" -value_type: numeric\n",
" -scale: 2500\n",
" -is_static: False\n",
" -is_image: False\n",
" -is_mosaic: False\n",
" -credentials: \n",
" -time res: 1h\n",
"\n",
"--- Known Modelobstypes ---\n",
"\n",
" -temp : ModelObstype instance of temp\n",
" -conversion: kelvin --> degree_Celsius\n",
" -pressure : ModelObstype instance of pressure\n",
" -conversion: 1.000 pascal --> hectopascal\n",
" -wind : ModelObstype_Vectorfield instance of wind\n",
" -vectorfield that will be converted to: \n",
" -wind_speed\n",
" -wind_direction\n",
" -conversion: meter / second --> meter / second\n",
"\n"
]
}
],
"source": [
"era5_model = metobs_toolkit.default_GEE_datasets['ERA5-land']\n",
"era5_model.get_info()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "16",
"metadata": {
"execution": {
"iopub.execute_input": "2025-05-14T11:45:06.665787Z",
"iopub.status.busy": "2025-05-14T11:45:06.665239Z",
"iopub.status.idle": "2025-05-14T11:45:06.676467Z",
"shell.execute_reply": "2025-05-14T11:45:06.675195Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"{'temp': ModelObstype instance of temp,\n",
" 'pressure': ModelObstype instance of pressure,\n",
" 'wind': ModelObstype_Vectorfield instance of wind}"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#or by accessing the attribute\n",
"era5_model.modelobstypes\n"
]
},
{
"cell_type": "markdown",
"id": "17",
"metadata": {},
"source": [
"As an example, we will create a new ModelObstype that represents the accumulated precipitation as is present in the ERA5_land GEE dataset."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "18",
"metadata": {
"execution": {
"iopub.execute_input": "2025-05-14T11:45:06.681264Z",
"iopub.status.busy": "2025-05-14T11:45:06.680390Z",
"iopub.status.idle": "2025-05-14T11:45:06.694687Z",
"shell.execute_reply": "2025-05-14T11:45:06.693430Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"{'temp': ModelObstype instance of temp,\n",
" 'pressure': ModelObstype instance of pressure,\n",
" 'wind': ModelObstype_Vectorfield instance of wind,\n",
" 'cumulated_precip': ModelObstype instance of cumulated_precip}"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"from datetime import datetime\n",
"#Create a new observation type\n",
"precipitation = metobs_toolkit.Obstype(obsname='cumulated_precip',\n",
" std_unit='mm',\n",
" description='Cumulated total precipitation since midnight per squared meter')\n",
"\n",
"#Create the ModelObstype\n",
"precip_in_era5 = metobs_toolkit.ModelObstype(\n",
" obstype=precipitation,\n",
" model_band='total_precipitation', #look this up: https://developers.google.com/earth-engine/datasets/catalog/ECMWF_ERA5_LAND_HOURLY#bands \n",
" model_unit='m',\n",
" )\n",
"# Add it to the ERA5 model\n",
"era5_model.add_modelobstype(precip_in_era5)\n",
"\n",
"era5_model.modelobstypes"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "19",
"metadata": {
"execution": {
"iopub.execute_input": "2025-05-14T11:45:06.697876Z",
"iopub.status.busy": "2025-05-14T11:45:06.697563Z",
"iopub.status.idle": "2025-05-14T11:45:06.703636Z",
"shell.execute_reply": "2025-05-14T11:45:06.702860Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"================================================================================\n",
" General info of ModelObstype \n",
"================================================================================\n",
"\n",
"\n",
"--- Obstype info ---\n",
"\n",
"cumulated_precip observation with:\n",
" -standard unit: millimeter\n",
" -description: Cumulated total precipitation since midnight per squared mete...\n",
"\n",
"--- Model related info ---\n",
"\n",
" -corresponding bandname: total_precipitation\n",
" -original modeldata unit: 1.000 meter\n",
"\n"
]
}
],
"source": [
"precip_in_era5.get_info()"
]
},
{
"cell_type": "markdown",
"id": "20",
"metadata": {},
"source": [
"Now you can extract cumulated precipitation data directly from GEE. We refer to the [GEE Notebook](gee_example.ipynb) for an example on extracting ERA5 data."
]
},
{
"cell_type": "markdown",
"id": "21",
"metadata": {},
"source": [
"### ModelObstype_Vectorfield\n",
"At a specific height, the wind can be seen (by approximation) as a 2D vector field. The vector components are often stored in different bands/variables in a model.\n",
"\n",
"For example, if you want the 10m windspeed from ERA5 you cannot find a band for the windspeed. There are bands for the\n",
"u and v component of the wind. \n",
"\n",
"The `ModelObstype_Vectorfield` class represents a modelobstype, for which there does not exist a band, but can be constructed from (orthogonal) components. The vector amplitudes and direction are computed, and the corresponding `ModelObstype`'s are created.\n",
"\n",
"By default, the *wind* is added as a `ModelObstype_vectorfield` for the ERA5-land `GeeDynamicDataset`."
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "22",
"metadata": {
"execution": {
"iopub.execute_input": "2025-05-14T11:45:06.707762Z",
"iopub.status.busy": "2025-05-14T11:45:06.707228Z",
"iopub.status.idle": "2025-05-14T11:45:06.726768Z",
"shell.execute_reply": "2025-05-14T11:45:06.720729Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/plain": [
"{'temp': ModelObstype instance of temp,\n",
" 'pressure': ModelObstype instance of pressure,\n",
" 'wind': ModelObstype_Vectorfield instance of wind,\n",
" 'cumulated_precip': ModelObstype instance of cumulated_precip}"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"era5_model.modelobstypes"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "23",
"metadata": {
"execution": {
"iopub.execute_input": "2025-05-14T11:45:06.736593Z",
"iopub.status.busy": "2025-05-14T11:45:06.735359Z",
"iopub.status.idle": "2025-05-14T11:45:06.752625Z",
"shell.execute_reply": "2025-05-14T11:45:06.750901Z"
}
},
"outputs": [
{
"data": {
"text/html": [
"\n",
" \n",
" "
],
"text/plain": [
""
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"================================================================================\n",
" General info of ModelObstype_Vectorfield \n",
"================================================================================\n",
"\n",
"\n",
"--- Obstype info ---\n",
"\n",
"wind observation with:\n",
" -standard unit: meter / second\n",
" -description: 2D-vector combined 10m windspeed. Care should be taken when c...\n",
"\n",
"--- Model related info ---\n",
"\n",
" -U-component bandname: u_component_of_wind_10m\n",
" -in meter / second\n",
" -V-component bandname: v_component_of_wind_10m\n",
" -in meter / second\n",
"\n"
]
}
],
"source": [
"era5_wind = era5_model.modelobstypes['wind']\n",
"era5_wind.get_info()"
]
},
{
"cell_type": "markdown",
"id": "24",
"metadata": {},
"source": [
"So we can see that *wind* corresponds with two bands (the u and v component)."
]
},
{
"cell_type": "markdown",
"id": "25",
"metadata": {},
"source": [
"When extracting the wind data from era5 (on GEE) the toolkit will\n",
" 1. Download the u and v wind components for your period and locations.\n",
" 2. Convert each component to its standard units (m/s for the wind components).\n",
" 3. Compute the amplitude and the direction (in degrees from North, clockwise).\n",
" 4. Add a `ModelObstype` for the amplitude and one for the direction.\n",
"\n",
" For an example, see the [GEE Notebook](gee_example.ipynb)."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "metobs_dev",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.10"
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
"state": {
"6101f8135943461abd0a87e4dfd6dec9": {
"model_module": "@jupyter-widgets/base",
"model_module_version": "2.0.0",
"model_name": "LayoutModel",
"state": {
"_model_module": "@jupyter-widgets/base",
"_model_module_version": "2.0.0",
"_model_name": "LayoutModel",
"_view_count": null,
"_view_module": "@jupyter-widgets/base",
"_view_module_version": "2.0.0",
"_view_name": "LayoutView",
"align_content": null,
"align_items": null,
"align_self": null,
"border_bottom": null,
"border_left": null,
"border_right": null,
"border_top": null,
"bottom": null,
"display": null,
"flex": null,
"flex_flow": null,
"grid_area": null,
"grid_auto_columns": null,
"grid_auto_flow": null,
"grid_auto_rows": null,
"grid_column": null,
"grid_gap": null,
"grid_row": null,
"grid_template_areas": null,
"grid_template_columns": null,
"grid_template_rows": null,
"height": null,
"justify_content": null,
"justify_items": null,
"left": null,
"margin": null,
"max_height": null,
"max_width": null,
"min_height": null,
"min_width": null,
"object_fit": null,
"object_position": null,
"order": null,
"overflow": null,
"padding": null,
"right": null,
"top": null,
"visibility": null,
"width": "500px"
}
}
},
"version_major": 2,
"version_minor": 0
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}