Mapping to the toolkit

The MetObs-toolkit uses standard names and formats for your data. To use the toolkit, your observational data must be converted to the toolkit standards; this conversion is referred to as mapping.

A template specifies how the mapping is done. It contains all the information needed to convert your tabular data to the toolkit standards. A template is saved as a JSON file and can be reused or shared. In practice, you only need to make one template file for your network.

On this page, you can find information on how to construct a template.

Raw data structures

To make a template, you must know which format your data is in. The toolkit can handle the following data structures:

Long-format

Observations are stacked in rows per station. One column represents the station names.

Timestamp            2m Temperature  2m Humidity  ID
2022-06-07 13:20:00  16.4            77.3         Station_A
2022-06-07 13:30:00  16.7            75.6         Station_A
2022-06-07 13:20:00  18.3            68.9         Station_B
2022-06-07 13:30:00  18.6            71.9         Station_B
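
To make the "stacked in rows per station" idea concrete, here is a minimal sketch that reads the long-format table above as CSV text and groups the rows by the station-name column. It uses only the Python standard library and is not part of the toolkit; the helper name group_by_station and the comma delimiter are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# The long-format example table, written out as comma-separated text.
LONG_CSV = """Timestamp,2m Temperature,2m Humidity,ID
2022-06-07 13:20:00,16.4,77.3,Station_A
2022-06-07 13:30:00,16.7,75.6,Station_A
2022-06-07 13:20:00,18.3,68.9,Station_B
2022-06-07 13:30:00,18.6,71.9,Station_B
"""


def group_by_station(csv_text, name_column="ID"):
    """Group the rows of a long-format CSV by the station-name column.

    Hypothetical helper for illustration only (not a toolkit function).
    """
    per_station = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        station = row.pop(name_column)  # the remaining keys are the observations
        per_station[station].append(row)
    return dict(per_station)


stations = group_by_station(LONG_CSV)
for name, rows in stations.items():
    print(name, "has", len(rows), "rows")
```

Every row carries its own station name, which is what distinguishes this structure from the single-station-format described next.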

Single-station-format

The same as the long-format, but without a column indicating the station names. Be aware that the toolkit interprets the data as observations coming from one station.

Timestamp            2m Temperature  2m Humidity
2022-06-07 13:20:00  16.4            77.3
2022-06-07 13:30:00  16.7            75.6

Wide-format

Columns represent different stations. The data represents one observation type.

Timestamp            Station_A  Station_B
2022-06-07 13:20:00  16.4       18.3
2022-06-07 13:30:00  16.7       18.6
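
The relation between the wide- and long-format can be sketched with a small reshaping routine. The toolkit accepts wide-format files directly, so this step is not required; the sketch only illustrates that a wide table holds a single observation type, with the station names moved from a column into the header. The helper name wide_to_long is hypothetical, and the values are assumed to be 2m temperatures.

```python
import csv
import io

# The wide-format example table, written out as comma-separated text.
WIDE_CSV = """Timestamp,Station_A,Station_B
2022-06-07 13:20:00,16.4,18.3
2022-06-07 13:30:00,16.7,18.6
"""


def wide_to_long(csv_text, obstype, timestamp_column="Timestamp"):
    """Turn a wide table (one column per station) into long-format rows.

    Hypothetical helper for illustration only (not a toolkit function).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    # Every column except the timestamp column is a station.
    station_columns = [c for c in reader.fieldnames if c != timestamp_column]
    long_rows = []
    for row in reader:
        for station in station_columns:
            long_rows.append(
                {
                    timestamp_column: row[timestamp_column],
                    obstype: row[station],
                    "ID": station,
                }
            )
    return long_rows


long_rows = wide_to_long(WIDE_CSV, obstype="2m Temperature")
for row in long_rows:
    print(row)
```

Because the header of a wide file carries no information about the observation type, that information has to be supplied separately, here through the obstype argument.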

Template creation

Once you have converted your tabular data files to either long-, wide-, or single-station-format and saved them as .csv files, a template can be made.

Note: If you want to use a metadata file, make sure it is converted to a wide-format and saved as a .csv file.

The fastest and simplest way to make a template is by using the metobs_toolkit.build_template_prompt() function.

import metobs_toolkit

# Start the interactive prompt that builds a template
metobs_toolkit.build_template_prompt()

This function asks a series of questions and builds a template that matches your data file (and metadata file). The template.json file is stored at a location of your choice.

Note: When the prompt asks if you need further help and you type yes, some more questions are asked. Once all information is given, the prompt prints a piece of code that you have to run to load your data into the toolkit.

Use the template file when importing the raw data.

[1]:
import metobs_toolkit

dataset = metobs_toolkit.Dataset()  # Initiate an empty dataset
dataset.import_data_from_file(
    input_data_file=metobs_toolkit.demo_datafile,  # Path to your data (csv) file
    input_metadata_file=metobs_toolkit.demo_metadatafile,  # Path to your metadata (csv) file
    template_file=metobs_toolkit.demo_template,  # Path to your template (json) file
)

Luchtdruk is present in the datafile, but not found in the template! This column will be ignored.
Neerslagintensiteit is present in the datafile, but not found in the template! This column will be ignored.
Neerslagsom is present in the datafile, but not found in the template! This column will be ignored.
Rukwind is present in the datafile, but not found in the template! This column will be ignored.
Luchtdruk_Zeeniveau is present in the datafile, but not found in the template! This column will be ignored.
Globe Temperatuur is present in the datafile, but not found in the template! This column will be ignored.
The following columns are present in the data file, but not in the template! They are skipped!
 ['Neerslagsom', 'Neerslagintensiteit', 'Rukwind', 'Globe Temperatuur', 'Luchtdruk', 'Luchtdruk_Zeeniveau']
The following columns are found in the metadata, but not in the template and are therefore ignored:
['benaming', 'sponsor', 'Network', 'stad']
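
The messages above follow from a simple rule: columns of the data file that are not mapped in the template are skipped. A minimal sketch of that rule, useful for checking your own file before importing, is given here. It is not the toolkit's implementation; split_columns is a hypothetical helper, the column order of the demo file is an assumption, and the mapped names are those listed in the template overview of the demo template.

```python
def split_columns(data_columns, mapped_columns):
    """Split data-file columns into those the template maps and those it does not.

    Hypothetical helper for illustration only (not a toolkit function).
    """
    mapped = set(mapped_columns)
    used = [c for c in data_columns if c in mapped]
    ignored = [c for c in data_columns if c not in mapped]
    return used, ignored


# Header of the demo data file (the column order is an assumption).
data_columns = [
    "Vlinder", "Datum", "Tijd (UTC)", "Temperatuur", "Vochtigheid",
    "Luchtdruk", "Neerslagintensiteit", "Neerslagsom", "Windrichting",
    "Windsnelheid", "Rukwind", "Luchtdruk_Zeeniveau", "Globe Temperatuur",
]

# Columns mapped by the demo template: observation types, station name, date and time.
mapped_columns = [
    "Temperatuur", "Vochtigheid", "Windsnelheid", "Windrichting",
    "Vlinder", "Tijd (UTC)", "Datum",
]

used, ignored = split_columns(data_columns, mapped_columns)
print("Used by the template:", used)
print("Ignored on import:", ignored)
```

If a column you need shows up in the ignored list, add it to the template before importing.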

The template file is read when calling the Dataset.import_data_from_file() method and converted to a metobs_toolkit.Template object, which is accessible from each dataset.

The Template object converts raw data to a standard format interpretable by the toolkit and is stored as the Dataset.template attribute. It is only used when importing raw data and has no further use.

[2]:
dataset.template
[2]:
<metobs_toolkit.template.Template at 0x7f343c2d9820>

An overview of the template can be printed using the get_info() method of the Template instance:

[3]:
dataset.template.get_info()  # Print an overview of the template
================================================================================
                            General info of Template
================================================================================


--- Data obstypes map ---

  -temp: Temperatuur
    -raw data in degC
    -description: 2mT passive
  -humidity: Vochtigheid
    -raw data in percent
    -description: 2m relative humidity passive
  -wind_speed: Windsnelheid
    -raw data in km/h
    -description: Average 2m  10-min windspeed
  -wind_direction: Windrichting
    -raw data in degrees
    -description: Average 2m  10-min windspeed, north is zero in CW direction...

--- Data extra mapping info ---

  -name column (data) <---> Vlinder

--- Data timestamp map ---

  -datetimecolumn <---> None
  -time_column <---> Tijd (UTC)
  -date_column <---> Datum
  -fmt <---> %Y-%m-%d %H:%M:%S
  -Timezone <---> UTC

--- Metadata map ---

  -name <---> Vlinder
  -lat <---> lat
  -lon <---> lon
  -school <---> school