Demo: Custom observation types#

This demo shows how to use observation types.

[1]:
import metobs_toolkit

#Initialize an empty Dataset
your_dataset = metobs_toolkit.Dataset()

Default observation types#

An observation record must always be linked to an observation type, which is specified by the Obstype class. An Obstype represents one observation type (e.g. temperature) and handles its unit conversions and string representations.

By default, a set of standard observation types is stored in a Dataset:

[2]:
your_dataset.obstypes
[2]:
{'temp': Obstype(id=temp_degree_Celsius),
 'humidity': Obstype(id=humidity_percent),
 'radiation_temp': Obstype(id=radiation_temp_degree_Celsius),
 'pressure': Obstype(id=pressure_hectopascal),
 'pressure_at_sea_level': Obstype(id=pressure_at_sea_level_hectopascal),
 'precip': Obstype(id=precip_millimeter / meter ** 2),
 'precip_sum': Obstype(id=precip_sum_millimeter / meter ** 2),
 'wind_speed': Obstype(id=wind_speed_meter / second),
 'wind_gust': Obstype(id=wind_gust_meter / second),
 'wind_direction': Obstype(id=wind_direction_degree)}

The Obstype class#

As an example, we take a look at the temperature obstype, which is an instance of the Obstype class.

[3]:
temperature_obstype = your_dataset.obstypes['temp']
temperature_obstype.get_info()
================================================================================
                            General info of Obstype
================================================================================

temp observation with:
  -standard unit: degree_Celsius
  -description: 2m - temperature

The most important attribute of an Obstype is its standard unit: the unit in which values are converted and stored. For temperature, this is set to degrees Celsius by default.

[4]:
temperature_obstype.std_unit
[4]:
'degree_Celsius'
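
To make the role of the standard unit concrete, here is a minimal, pure-Python sketch of the idea. It is not how the toolkit implements unit handling (the toolkit's unit names suggest it relies on pint); the class `SimpleObstype`, the method `to_std_unit` and the table `TO_CELSIUS` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass

# Hypothetical converters to degrees Celsius (illustration only,
# not part of metobs_toolkit).
TO_CELSIUS = {
    "degree_Celsius": lambda value: value,
    "kelvin": lambda value: value - 273.15,
    "degree_Fahrenheit": lambda value: (value - 32.0) * 5.0 / 9.0,
}


@dataclass
class SimpleObstype:
    """Illustrative stand-in for an observation type with a standard unit."""

    name: str
    std_unit: str
    description: str

    def to_std_unit(self, value: float, unit: str) -> float:
        """Convert a raw value given in `unit` to the standard unit."""
        if self.std_unit != "degree_Celsius":
            raise ValueError("This sketch only supports degree_Celsius.")
        if unit not in TO_CELSIUS:
            raise ValueError(f"Unknown unit: {unit}")
        return TO_CELSIUS[unit](value)


temp = SimpleObstype(
    name="temp", std_unit="degree_Celsius", description="2m - temperature"
)

# Raw values in different units all end up in the standard unit.
for raw_value, raw_unit in [(293.15, "kelvin"), (68.0, "degree_Fahrenheit")]:
    print(raw_value, raw_unit, "->", temp.to_std_unit(raw_value, raw_unit), temp.std_unit)
```

Whatever unit the raw data arrives in, every stored value shares one unit, so records from different sources stay comparable.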

Creating a new observation type#

In practice, a new observation type is most commonly defined before importing the raw dataset. When you create a template file with the metobs_toolkit.build_template_prompt() function, the prompt prints, at the end, a snippet of code that creates the new observation type.

As an illustration, we will define a new observation type for gas concentrations.

[5]:
gas_concentration = metobs_toolkit.Obstype(
        name='gas_ratio',
        std_unit='ppm', #see all available units: https://github.com/hgrecco/pint/blob/master/pint/default_en.txt
        description='The gas concentration measured at 2m',
        )

If your raw data contains gas concentrations, add the new observation type to the Dataset before importing the data.

[6]:
your_dataset = metobs_toolkit.Dataset() #Initiate an empty dataset

your_dataset.add_new_observationtype(gas_concentration) #add the new observation type

#your_dataset.import_data_from_file( ... )

Obstypes for Modeldata#

ModelObstype#

An extension of the Obstype class is the ModelObstype class, which is used for interacting with GEE datasets. In addition to the information of a regular Obstype, a ModelObstype stores which band of the GEE dataset represents the observation, and it handles the unit conversion from the band's unit to the standard unit.

Note: All methods that work on an Obstype also work on a ModelObstype.

Since a ModelObstype is specific to one GEE dataset, the known ModelObstypes are stored per GeeDynamicDatasetManager (the class in the MetObs-toolkit that defines a dynamic Google Earth Engine dataset). As a default, an ERA5-land GeeDynamicDatasetManager is stored in all Datasets.

The following GEE datasets are defined by default:

[7]:
metobs_toolkit.default_GEE_datasets
[7]:
{'LCZ': GEEStaticDatasetManager(name=LCZ, location=RUB/RUBCLIM/LCZ/global_lcz_map/latest),
 'altitude': GEEStaticDatasetManager(name=altitude, location=CGIAR/SRTM90_V4),
 'worldcover': GEEStaticDatasetManager(name=worldcover, location=ESA/WorldCover/v200),
 'ERA5-land': GEEDynamicDatasetManager(name=ERA5-land, location=ECMWF/ERA5_LAND/HOURLY)}

As an example, we take a look at 'ERA5-land', which is a GeeDynamicDatasetManager representing the ERA5-land dataset on GEE.

The get_info() method (or the .modelobstypes attribute) shows the ModelObstypes that are present.

[8]:
era5_model = metobs_toolkit.default_GEE_datasets['ERA5-land']
era5_model.get_info()
================================================================================
                       General info of GEEDynamicDataset
================================================================================


--- GEE Dataset details ---

  -name: ERA5-land
  -location: ECMWF/ERA5_LAND/HOURLY
  -value_type: numeric
  -scale: 2500
  -is_static: False
  -is_image: False
  -is_mosaic: False
  -credentials:
  -time res: 1h

--- Known Modelobstypes ---

  -temp : ModelObstype instance of temp
    -conversion: kelvin --> degree_Celsius
  -pressure : ModelObstype instance of pressure
    -conversion: pascal --> hectopascal
  -wind : ModelObstype_Vectorfield instance of wind
    -vectorfield that will be converted to:
      -wind_speed
      -wind_direction
    -conversion: meter / second --> meter / second

[9]:
#or by accessing the attribute
era5_model.modelobstypes

[9]:
{'temp': ModelObstype(id=temp_degree_Celsius_temperature_2m),
 'pressure': ModelObstype(id=pressure_hectopascal_surface_pressure),
 'wind': ModelObstype_Vectorfield(id=wind_meter / second_u_component_of_wind_10m_v_component_of_wind_10m)}

As an example, we will create a new ModelObstype that represents the accumulated precipitation as present in the ERA5-land GEE dataset.

[10]:
#Create a new observation type
precipitation = metobs_toolkit.Obstype(name='cumulated_precip',
                                      std_unit='mm',
                                      description='Cumulated total precipitation since midnight per squared meter')

#Create the ModelObstype
precip_in_era5 = metobs_toolkit.ModelObstype(
                        obstype=precipitation,
                        model_band='total_precipitation', #look this up: https://developers.google.com/earth-engine/datasets/catalog/ECMWF_ERA5_LAND_HOURLY#bands
                        model_unit='m',
               )
# Add it to the ERA5 model
era5_model.add_modelobstype(precip_in_era5)

era5_model.modelobstypes
[10]:
{'temp': ModelObstype(id=temp_degree_Celsius_temperature_2m),
 'pressure': ModelObstype(id=pressure_hectopascal_surface_pressure),
 'wind': ModelObstype_Vectorfield(id=wind_meter / second_u_component_of_wind_10m_v_component_of_wind_10m),
 'cumulated_precip': ModelObstype(id=cumulated_precip_millimeter_total_precipitation)}
[11]:
precip_in_era5.get_info()
================================================================================
                          General info of ModelObstype
================================================================================


--- Obstype info ---

cumulated_precip observation with:
  -standard unit: millimeter
  -description: Cumulated total precipitation since midnight per squared mete...

--- Model related info ---

  -corresponding bandname: total_precipitation
  -original modeldata unit: meter

Now you can extract cumulated precipitation data directly from GEE. We refer to the GEE Notebook for an example on extracting ERA5 data.
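
The unit handling of a ModelObstype, such as the conversion from meter to millimeter for the total_precipitation band above, can be pictured with a small pure-Python sketch. It is an illustration under stated assumptions, not the toolkit's implementation: the class `BandLink`, its method `to_std_unit` and the table `CONVERSIONS` are hypothetical, and only the three conversions that appear in this demo are covered.

```python
from dataclasses import dataclass

# Hypothetical conversion table: (model unit, standard unit) -> function.
# Only the conversions mentioned in this demo are included.
CONVERSIONS = {
    ("kelvin", "degree_Celsius"): lambda x: x - 273.15,
    ("pascal", "hectopascal"): lambda x: x / 100.0,
    ("meter", "millimeter"): lambda x: x * 1000.0,
}


@dataclass
class BandLink:
    """Illustrative link between an observation type and a model band."""

    obstype_name: str
    std_unit: str
    model_band: str
    model_unit: str

    def to_std_unit(self, band_value: float) -> float:
        """Convert a value read from the band to the standard unit."""
        if self.model_unit == self.std_unit:
            return band_value
        key = (self.model_unit, self.std_unit)
        if key not in CONVERSIONS:
            raise ValueError(f"No conversion known for {key}")
        return CONVERSIONS[key](band_value)


precip_link = BandLink(
    obstype_name="cumulated_precip",
    std_unit="millimeter",
    model_band="total_precipitation",
    model_unit="meter",
)

# A band value of 0.0025 m corresponds to about 2.5 mm.
print(precip_link.model_band, "->", precip_link.to_std_unit(0.0025), precip_link.std_unit)
```

The band name and the model unit describe the model side, while the standard unit belongs to the observation type, so model data and observations can be compared in the same unit.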

ModelObstype_Vectorfield#

At a specific height, the wind can be seen (by approximation) as a 2D vector field. The vector components are often stored in different bands/variables in a model.

For example, if you want the 10m wind speed from ERA5, there is no band for the wind speed itself. Instead, there are bands for the u and v components of the wind.

The ModelObstype_Vectorfield class represents a modelobstype for which no band exists, but which can be constructed from (orthogonal) components. The vector amplitude and direction are computed, and the corresponding ModelObstypes are created.

By default, the wind is added as a ModelObstype_Vectorfield for the ERA5-land GeeDynamicDatasetManager.

[12]:
era5_model.modelobstypes
[12]:
{'temp': ModelObstype(id=temp_degree_Celsius_temperature_2m),
 'pressure': ModelObstype(id=pressure_hectopascal_surface_pressure),
 'wind': ModelObstype_Vectorfield(id=wind_meter / second_u_component_of_wind_10m_v_component_of_wind_10m),
 'cumulated_precip': ModelObstype(id=cumulated_precip_millimeter_total_precipitation)}
[13]:
era5_wind = era5_model.modelobstypes['wind']
era5_wind.get_info()
================================================================================
                    General info of ModelObstype_Vectorfield
================================================================================


--- Obstype info ---

wind observation with:
  -standard unit: meter / second
  -description: 2D-vector combined 10m windspeed. Care should be taken when c...

--- Model related info ---

  -U-component bandname: u_component_of_wind_10m
    -in meter / second
  -V-component bandname: v_component_of_wind_10m
    -in meter / second

So we can see that wind corresponds to two bands (the u and v components).

When extracting the wind data from ERA5 (on GEE), the toolkit will:

  1. Download the u and v wind components for your period and locations.

  2. Convert each component to its standard units (m/s for the wind components).

  3. Compute the amplitude and the direction (in degrees from North, clockwise).

  4. Add a ModelObstype for the amplitude and one for the direction.

For an example, see the GEE Notebook.
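
The amplitude and direction computation in step 3 can be sketched in plain Python. This is a minimal illustration, not the toolkit's code: the function `wind_speed_and_direction` and its `blowing_from` flag are hypothetical. By default the sketch returns the bearing of the (u, v) vector itself, measured clockwise from North. Whether the toolkit reports this bearing or the meteorological "blowing from" direction (which differs by 180 degrees) is not stated here, so check the toolkit documentation before relying on either convention.

```python
import math


def wind_speed_and_direction(u, v, blowing_from=False):
    """Return (speed, direction in degrees clockwise from North).

    u is the eastward component and v the northward component, both in
    the same unit (e.g. m/s). With blowing_from=True, the direction the
    wind comes from is returned (an assumed convention, 180 degrees
    opposite to the vector's own bearing).
    """
    speed = math.hypot(u, v)
    # atan2(u, v) measures the angle from the North axis towards East,
    # which is a clockwise bearing. For calm wind (u = v = 0) it gives 0.
    direction = math.degrees(math.atan2(u, v)) % 360.0
    if blowing_from:
        direction = (direction + 180.0) % 360.0
    return speed, direction


# Example: 3 m/s eastward and 4 m/s northward gives a speed of 5 m/s.
speed, direction = wind_speed_and_direction(3.0, 4.0)
print(f"speed={speed:.2f} m/s, direction={direction:.1f} degrees")
```

Using `atan2` rather than a plain arctangent of u/v keeps the quadrant information and avoids a division by zero when v is zero.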