Introduction to the MetObs-toolkit#

In this introduction, you will learn the principal components and methods in the MetObs-toolkit. Let’s start by importing it.

Since this package is under development, it is often relevant to know the precise version of the toolkit.

[1]:

import metobs_toolkit

#Print out the version of the toolkit
print(metobs_toolkit.__version__)
import xarray as xr

1.0.0a13

The Dataset class#

The Dataset class is, for most applications, the most important class. It holds all your stations and their data. A Dataset is thus, in principle, a collection of stations.

Since raw data files often include observations from multiple stations, raw data is always imported directly into a Dataset, using the Dataset.import_data_from_file() method.

A key component of importing raw data is a description of what the data represents and how it is formatted. You provide this description as a template file that describes how your raw data is structured.

Importing your raw data#

As an example, we will import a demo file of raw observations. To do that, we need to:

Create a template file for our raw data file. The build_template_prompt() function guides you through this process: it asks a series of questions and, once you have answered them, creates a template file. It also proposes some code that you can use to import your data.
Create a Dataset instance
Add the raw data into the Dataset.

[2]:

# Specify the path to your raw data file (we use the demo file as example)
path_to_datafile=metobs_toolkit.demo_datafile

# We will also use a metadata file
path_to_metadatafile=metobs_toolkit.demo_metadatafile

[3]:

%%script true

#Create a template for these data files
metobs_toolkit.build_template_prompt()

[4]:

#specify the path to the templatefile that was created
path_to_templatefile=metobs_toolkit.demo_template #demo file as example!!

Now that we have the data files and the template file, we create an empty Dataset and import the data into it.

[5]:

dataset = metobs_toolkit.Dataset() #Create a new dataset object

#Load the data
dataset.import_data_from_file(
                    template_file=path_to_templatefile, #The template file
                    input_data_file=path_to_datafile, #The data file
                    input_metadata_file=path_to_metadatafile, #The metadata file
                    )

Luchtdruk is present in the datafile, but not found in the template! This column will be ignored.
Neerslagintensiteit is present in the datafile, but not found in the template! This column will be ignored.
Neerslagsom is present in the datafile, but not found in the template! This column will be ignored.
Rukwind is present in the datafile, but not found in the template! This column will be ignored.
Luchtdruk_Zeeniveau is present in the datafile, but not found in the template! This column will be ignored.
Globe Temperatuur is present in the datafile, but not found in the template! This column will be ignored.
The following columns are present in the data file, but not in the template! They are skipped!
 ['Luchtdruk', 'Luchtdruk_Zeeniveau', 'Globe Temperatuur', 'Neerslagsom', 'Rukwind', 'Neerslagintensiteit']
The following columns are found in the metadata, but not in the template and are therefore ignored:
['stad', 'benaming', 'sponsor', 'Network']

As the printed logs show, a lot happens when the data is imported. That is because tests are applied to your data to check for gaps and for mismatches between the data and the metadata.

We can now inspect the dataset further.

The attributes#

The attributes hold the data of the dataset. Here we present some attributes that are useful to inspect.

All classes in the MetObs-toolkit have a get_info() method that prints an overview of their content.

Dataset.obstypes: A collection of the Obstypes that are known. An observation type describes a measurable quantity and its corresponding units.

[6]:

dataset.obstypes

[6]:

{'temp': Obstype(id=temp_degree_Celsius),
 'humidity': Obstype(id=humidity_percent),
 'radiation_temp': Obstype(id=radiation_temp_degree_Celsius),
 'pressure': Obstype(id=pressure_hectopascal),
 'pressure_at_sea_level': Obstype(id=pressure_at_sea_level_hectopascal),
 'precip': Obstype(id=precip_millimeter / meter ** 2),
 'precip_sum': Obstype(id=precip_sum_millimeter / meter ** 2),
 'wind_speed': Obstype(id=wind_speed_meter / second),
 'wind_gust': Obstype(id=wind_gust_meter / second),
 'wind_direction': Obstype(id=wind_direction_degree)}

[7]:

#Note! The known obstypes are NOT the obstypes for which there are observations.
#To get the obstypes for which there are observations, use:
dataset.present_observations

[7]:

['humidity', 'temp', 'wind_direction', 'wind_speed']
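
To make the difference between known and present observation types concrete, the short sketch below compares the two collections with plain Python sets. The names are copied by hand from the two outputs above, so the block runs without the toolkit. With a real Dataset, set(dataset.obstypes) and set(dataset.present_observations) should give the same inputs, assuming these attributes behave like the dict and list shown above.

```python
# Keys of dataset.obstypes, copied from the output of cell [6]
known = [
    "temp", "humidity", "radiation_temp", "pressure", "pressure_at_sea_level",
    "precip", "precip_sum", "wind_speed", "wind_gust", "wind_direction",
]

# dataset.present_observations, copied from the output of cell [7]
present = ["humidity", "temp", "wind_direction", "wind_speed"]

# Observation types that are defined, but for which no records were imported
without_records = sorted(set(known) - set(present))
print(without_records)
```

In this demo, six of the ten known observation types have no records, because the corresponding columns of the raw data file are not mapped in the template.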

Dataset.template: A template class that is set up automatically from the template file. It is only used when data is imported from a file and has no further use.

[8]:

template = dataset.template

template.get_info() # Prints out how the template maps raw data

================================================================================
                            General info of Template
================================================================================


--- Data obstypes map ---

  -temp: Temperatuur
    -raw data in degC
    -description: 2mT passive
  -humidity: Vochtigheid
    -raw data in percent
    -description: 2m relative humidity passive
  -wind_speed: Windsnelheid
    -raw data in km/h
    -description: Average 2m  10-min windspeed
  -wind_direction: Windrichting
    -raw data in degrees
    -description: Average 2m  10-min windspeed, north is zero in CW direction...

--- Data extra mapping info ---

  -name column (data) <---> Vlinder

--- Data timestamp map ---

  -datetimecolumn <---> None
  -time_column <---> Tijd (UTC)
  -date_column <---> Datum
  -fmt <---> %Y-%m-%d %H:%M:%S
  -Timezone <---> UTC

--- Metadata map ---

  -name <---> Vlinder
  -lat <---> lat
  -lon <---> lon
  -school <---> school

dataset.df: A pandas DataFrame holding all the observation records.

[9]:

dataset.df

[9]:

			value	label
datetime	obstype	name
2022-09-01 00:00:00+00:00	humidity	vlinder01	65.000000	ok
		vlinder02	62.000000	ok
		vlinder03	65.000000	ok
		vlinder04	66.000000	ok
		vlinder05	61.000000	ok
...	...	...	...	...
2022-09-15 23:55:00+00:00	wind_speed	vlinder24	0.000000	ok
		vlinder25	1.972222	ok
		vlinder26	0.027778	ok
		vlinder27	0.000000	ok
		vlinder28	0.000000	ok

483840 rows × 2 columns
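
The template above reports the raw wind speed in km/h, while the wind_speed Obstype uses meter / second. This suggests that the values are converted to the Obstype units during import. The sketch below is not toolkit code. It only shows that wind speeds such as 0.027778 and 1.972222 in the table above are consistent with raw readings of 0.1 and 7.1 km/h. These raw readings are an assumption, since the raw file is not shown here.

```python
def kmh_to_ms(speed_kmh):
    """Convert a speed from kilometers per hour to meters per second."""
    return speed_kmh / 3.6  # 1 m/s equals 3.6 km/h


# Assumed raw readings in km/h (not taken from the raw data file)
for raw_kmh in (0.1, 7.1):
    print(f"{raw_kmh} km/h = {kmh_to_ms(raw_kmh):.6f} m/s")
# 0.1 km/h = 0.027778 m/s
# 7.1 km/h = 1.972222 m/s
```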

dataset.metadf: A pandas DataFrame holding all the metadata of the stations.

[10]:

dataset.metadf

[10]:

	lat	lon	altitude	LCZ	school	geometry
name
vlinder01	50.980438	3.815763	NaN	NaN	UGent	POINT (3.81576 50.98044)
vlinder02	51.022381	3.709695	NaN	NaN	UGent	POINT (3.7097 51.02238)
vlinder03	51.324581	4.952109	NaN	NaN	Heilig Graf	POINT (4.95211 51.32458)
vlinder04	51.335522	4.934732	NaN	NaN	Heilig Graf	POINT (4.93473 51.33552)
vlinder05	51.052654	3.675183	NaN	NaN	Sint-Barbara	POINT (3.67518 51.05265)
vlinder06	51.027100	4.516300	NaN	NaN	BimSem	POINT (4.5163 51.0271)
vlinder07	51.030888	4.478445	NaN	NaN	PTS	POINT (4.47845 51.03089)
vlinder08	51.028130	4.477398	NaN	NaN	TSM	POINT (4.4774 51.02813)
vlinder09	50.927166	4.075722	NaN	NaN	SMI	POINT (4.07572 50.92717)
vlinder10	50.935555	4.041389	NaN	NaN	SMI	POINT (4.04139 50.93555)
vlinder11	51.222424	4.381726	NaN	NaN	Sint-Annacollege	POINT (4.38173 51.22242)
vlinder12	51.216476	4.423440	NaN	NaN	UGent	POINT (4.42344 51.21648)
vlinder13	51.212212	4.398065	NaN	NaN	UGent	POINT (4.39807 51.21221)
vlinder14	51.350616	4.315013	NaN	NaN	UGent	POINT (4.31501 51.35062)
vlinder15	50.935299	4.192600	NaN	NaN	Sint-Martinus	POINT (4.1926 50.9353)
vlinder16	51.266850	4.293436	NaN	NaN	Sint-Maarten	POINT (4.29344 51.26685)
vlinder17	51.065269	5.613458	NaN	NaN	Sint-Augustinusinstituut Bree	POINT (5.61346 51.06527)
vlinder18	51.136246	5.656769	NaN	NaN	TISM Bree	POINT (5.65677 51.13625)
vlinder19	50.841454	4.363672	NaN	NaN	UGent	POINT (4.36367 50.84145)
vlinder20	50.847027	4.357971	NaN	NaN	UGent	POINT (4.35797 50.84703)
vlinder21	51.260387	2.991917	NaN	NaN	Zeelyceum	POINT (2.99192 51.26039)
vlinder22	50.989502	2.856220	NaN	NaN	‘t Saam	POINT (2.85622 50.9895)
vlinder23	51.260578	3.580151	NaN	NaN	Richtpunt Eeklo	POINT (3.58015 51.26058)
vlinder24	51.167015	3.572062	NaN	NaN	OLV ten Doorn	POINT (3.57206 51.16702)
vlinder25	51.154720	3.708611	NaN	NaN	Einstein Atheneum	POINT (3.70861 51.15472)
vlinder26	51.161758	4.997653	NaN	NaN	Sint Dimpna	POINT (4.99765 51.16176)
vlinder27	51.058098	3.728067	NaN	NaN	Sec. Kunstinstituut	POINT (3.72807 51.0581)
vlinder28	51.035294	3.769741	NaN	NaN	GO! Ath.	POINT (3.76974 51.03529)

Station class#

The Station class is a representation of a station. A station holds the following:

Station.sensordata: Timeseries of an observation type. A station can hold multiple sensordata, one for each sensor.
Station.site: Each station has a Site attribute that holds the information on the location of the station. Metadata related to the station is also stored here.
Station.modeldata: In addition to the observations, modeldata timeseries representing the station can be stored. In practice, if you download ERA5 data (using the MetObs-toolkit), the timeseries are stored as modeldata in the Station.

To select a station, one can use the name of the station, which is assumed to be unique for each station.

All the methods and attributes that are present on the Dataset are also applicable to the Station! Thus, if your script works at Dataset level, it also works at Station level.

Only Dataset.sync_records(), Dataset.buddy_check(), and trivial Dataset-only methods (e.g. Dataset.get_station()) are not defined for Stations.

[11]:

#Select a station
your_station = dataset.get_station('vlinder02')

#Print out some details
your_station.get_info()

================================================================================
                            General info of Station
================================================================================


--- Observational info ---

Station instance with:
  -humidity:
    -humidity observations in percent
    -from 2022-09-01 00:00:00+00:00 -> 2022-09-15 23:55:00+00:00
    -At a resolution of 0 days 00:05:00
    -No outliers present.
    -2 gaps present, a total of 3 missing timestamps.
      -label counts:
        -gap: 3
  -temp:
    -temp observations in degree_Celsius
    -from 2022-09-01 00:00:00+00:00 -> 2022-09-15 23:55:00+00:00
    -At a resolution of 0 days 00:05:00
    -No outliers present.
    -2 gaps present, a total of 3 missing timestamps.
      -label counts:
        -gap: 3
  -wind_direction:
    -wind_direction observations in degree
    -from 2022-09-01 00:00:00+00:00 -> 2022-09-15 23:55:00+00:00
    -At a resolution of 0 days 00:05:00
    -No outliers present.
    -2 gaps present, a total of 3 missing timestamps.
      -label counts:
        -gap: 3
  -wind_speed:
    -wind_speed observations in meter / second
    -from 2022-09-01 00:00:00+00:00 -> 2022-09-15 23:55:00+00:00
    -At a resolution of 0 days 00:05:00
    -No outliers present.
    -2 gaps present, a total of 3 missing timestamps.
      -label counts:
        -gap: 3

--- Metadata info ---

  -Coordinates (51.022379, 3.709695) (latitude, longitude)
  -Altitude is unknown
  -LCZ is unknown
  -Land cover fractions are unknown
  -Extra metadata from the metadata file:
    -school: UGent

--- Modeldata info ---

  -Station instance without model data.

[12]:

# Inspecting the attributes of the station

#Print out info on the Site of the station:
your_station.site.get_info()

================================================================================
                              General Info of Site
================================================================================

Site of vlinder02:
  -Coordinates (51.022379, 3.709695) (latitude, longitude)
  -Altitude is unknown
  -LCZ is unknown
  -Land cover fractions are unknown
  -Extra metadata from the metadata file:
    -school: UGent

[13]:

# All observational data is stored as SensorData

print(your_station.get_sensor('temp'))

# More convenient is to use the pandas dataframe representations,
# similar as with the Dataset

your_station.df

temp data of station vlinder02.

[13]:

		value	label
datetime	obstype
2022-09-01 00:00:00+00:00	humidity	62.000000	ok
	temp	19.400000	ok
	wind_direction	25.000000	ok
	wind_speed	0.194444	ok
2022-09-01 00:05:00+00:00	humidity	62.000000	ok
...	...	...	...
2022-09-15 23:50:00+00:00	wind_speed	0.000000	ok
2022-09-15 23:55:00+00:00	humidity	83.000000	ok
	temp	12.900000	ok
	wind_direction	295.000000	ok
	wind_speed	0.000000	ok

17280 rows × 2 columns

[14]:

#Or the metadata for this single station
your_station.metadf

[14]:

	lat	lon	altitude	LCZ	school	geometry
name
vlinder02	51.022381	3.709695	NaN	NaN	UGent	POINT (3.7097 51.02238)

Plotting timeseries#

Timeseries can be plotted with the make_plot() method of a Dataset or a Station.

[15]:

dataset.make_plot(obstype='temp', #Which observation type to plot. (See dataset.present_observations)
                  colorby='station', #if 'station', each station will be a different color
                  show_outliers=True,
                  show_gaps=True)

[15]:

<Axes: title={'center': 'temp data.'}, xlabel='Timestamps (in UTC)', ylabel='temp (degree_Celsius)'>

[Figure: temperature timeseries of all stations, colored by station]

[16]:

#We can also plot a single station
your_station.make_plot(obstype='humidity',
                       colorby='label') #If 'label', the colors are based on the status/label of an observation.

[16]:

<Axes: title={'center': 'humidity data for station vlinder02'}, xlabel='Timestamps (in UTC)', ylabel='humidity (percent)'>

[Figure: humidity timeseries of station vlinder02, colored by label]

Common usecases#

Here is a collection of common use cases.

Resampling time resolution#

It is common to change the time resolution of your observations. This is often done when:

the amount of data is too big, and the present time resolution is not required for the analysis.
sensors do not have the same time resolution (e.g. temperature is measured every 5 minutes, but precipitation is measured each hour).
observations are not synchronized over multiple stations. This is a special case of resampling, since synchronization is also required.

It is recommended to set the target time resolution at the beginning of your pipeline!

In the MetObs-toolkit you can resample by using the resample() method on a Dataset or Station. By doing so, the toolkit will construct a set of target timestamps (in the new resolution), and will map the raw timestamps to the new target timestamps. There is no interpolation applied!

To map the raw timestamps to the target timestamps, a tolerance is used. For each target timestamp, the nearest raw timestamp is tested to see whether it lies within the tolerance. If it does not, no record can be assigned to the target timestamp and a gap is created. Increasing the shift_tolerance therefore maps more timestamps and produces fewer gaps, at the cost of less accurate timestamps.
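
The mapping idea can be sketched in a few lines of plain Python. This is an illustration only, not the toolkit's implementation. The function name map_to_targets, the raw records and the tolerances are all made up for this example, and the real resample() method may handle details differently, such as a raw record that is nearest to more than one target.

```python
from datetime import datetime, timedelta


def map_to_targets(raw, targets, tolerance):
    """Give each target timestamp the value of the nearest raw record,
    or None (a gap) when that record is farther away than the tolerance."""
    mapped = {}
    for target in targets:
        nearest = min(raw, key=lambda ts: abs(ts - target))
        within_tolerance = abs(nearest - target) <= tolerance
        mapped[target] = raw[nearest] if within_tolerance else None
    return mapped


# Hypothetical raw records that are slightly off the hourly grid
raw = {
    datetime(2022, 9, 1, 0, 2): 18.8,   # 2 minutes after 00:00
    datetime(2022, 9, 1, 0, 58): 18.1,  # 2 minutes before 01:00
    datetime(2022, 9, 1, 2, 10): 17.5,  # 10 minutes after 02:00
}
targets = [datetime(2022, 9, 1, hour) for hour in range(3)]

strict = map_to_targets(raw, targets, timedelta(minutes=4))
loose = map_to_targets(raw, targets, timedelta(minutes=15))

print(strict[targets[2]])  # None: the 02:10 record is outside the 4 minute tolerance
print(loose[targets[2]])   # 17.5: accepted, but the timestamp is shifted by 10 minutes
```

No value is interpolated in either case. A larger tolerance only decides whether an existing record may be relabelled with the target timestamp.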

[17]:

hourly_dataset = metobs_toolkit.Dataset()
#Load the data (raw data has 5 min resolution)
hourly_dataset.import_data_from_file(
                    template_file=path_to_templatefile, #The template file
                    input_data_file=path_to_datafile, #The data file
                    input_metadata_file=path_to_metadatafile, #The metadata file
                    )
#Resample to 1 hour resolution
hourly_dataset.resample(target_freq='1h', #Target frequency is set to 1 hour
                        obstype=None, #if None, all present observations are resampled
                        shift_tolerance='4min', #The maximum shift allowed for a timestamp
                        origin_simplify_tolerance='3min') # The maximum shift for the origin, to get a simplified origin

# You can verify that the resolution is hourly by inspecting the df attribute
hourly_dataset.df.index

WARNING:<metobs_toolkit>:Luchtdruk is present in the datafile, but not found in the template! This column will be ignored.
WARNING:<metobs_toolkit>:Neerslagintensiteit is present in the datafile, but not found in the template! This column will be ignored.
WARNING:<metobs_toolkit>:Neerslagsom is present in the datafile, but not found in the template! This column will be ignored.
WARNING:<metobs_toolkit>:Rukwind is present in the datafile, but not found in the template! This column will be ignored.
WARNING:<metobs_toolkit>:Luchtdruk_Zeeniveau is present in the datafile, but not found in the template! This column will be ignored.
WARNING:<metobs_toolkit>:Globe Temperatuur is present in the datafile, but not found in the template! This column will be ignored.
WARNING:<metobs_toolkit>:The following columns are present in the data file, but not in the template! They are skipped!
 ['Luchtdruk', 'Luchtdruk_Zeeniveau', 'Globe Temperatuur', 'Neerslagsom', 'Rukwind', 'Neerslagintensiteit']
WARNING:<metobs_toolkit>:The following columns are found in the metadata, but not in the template and are therefore ignored:
['stad', 'benaming', 'sponsor', 'Network']
WARNING:<metobs_toolkit>:The present gaps are removed, new gaps are constructed for temp data of station vlinder02..
WARNING:<metobs_toolkit>:The present gaps are removed, new gaps are constructed for wind_direction data of station vlinder02..
WARNING:<metobs_toolkit>:The present gaps are removed, new gaps are constructed for wind_speed data of station vlinder02..
WARNING:<metobs_toolkit>:The present gaps are removed, new gaps are constructed for humidity data of station vlinder02..

[17]:

MultiIndex([('2022-09-01 00:00:00+00:00',   'humidity', 'vlinder01'),
            ('2022-09-01 00:00:00+00:00',   'humidity', 'vlinder02'),
            ('2022-09-01 00:00:00+00:00',   'humidity', 'vlinder03'),
            ('2022-09-01 00:00:00+00:00',   'humidity', 'vlinder04'),
            ('2022-09-01 00:00:00+00:00',   'humidity', 'vlinder05'),
            ('2022-09-01 00:00:00+00:00',   'humidity', 'vlinder06'),
            ('2022-09-01 00:00:00+00:00',   'humidity', 'vlinder07'),
            ('2022-09-01 00:00:00+00:00',   'humidity', 'vlinder08'),
            ('2022-09-01 00:00:00+00:00',   'humidity', 'vlinder09'),
            ('2022-09-01 00:00:00+00:00',   'humidity', 'vlinder10'),
            ...
            ('2022-09-15 23:00:00+00:00', 'wind_speed', 'vlinder19'),
            ('2022-09-15 23:00:00+00:00', 'wind_speed', 'vlinder20'),
            ('2022-09-15 23:00:00+00:00', 'wind_speed', 'vlinder21'),
            ('2022-09-15 23:00:00+00:00', 'wind_speed', 'vlinder22'),
            ('2022-09-15 23:00:00+00:00', 'wind_speed', 'vlinder23'),
            ('2022-09-15 23:00:00+00:00', 'wind_speed', 'vlinder24'),
            ('2022-09-15 23:00:00+00:00', 'wind_speed', 'vlinder25'),
            ('2022-09-15 23:00:00+00:00', 'wind_speed', 'vlinder26'),
            ('2022-09-15 23:00:00+00:00', 'wind_speed', 'vlinder27'),
            ('2022-09-15 23:00:00+00:00', 'wind_speed', 'vlinder28')],
           names=['datetime', 'obstype', 'name'], length=40320)
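
One way to check the resolution programmatically is to look at the differences between consecutive unique timestamps in the datetime level. The sketch below does this on a small hand-made index, not on toolkit output. The same expressions should apply to hourly_dataset.df.index, assuming it is a pandas MultiIndex with a 'datetime' level as shown above.

```python
import pandas as pd

# Small stand-in for the index of a resampled dataset (illustrative values)
timestamps = pd.date_range(
    "2022-09-01", periods=3, freq=pd.Timedelta(hours=1), tz="UTC"
)
index = pd.MultiIndex.from_product(
    [timestamps, ["humidity", "temp"], ["vlinder01", "vlinder02"]],
    names=["datetime", "obstype", "name"],
)

# Unique timestamps, in order of appearance
unique_times = index.get_level_values("datetime").unique()

# Differences between consecutive timestamps; one unique step means a regular grid
steps = (unique_times[1:] - unique_times[:-1]).unique()
print(steps)
```

If more than one step size shows up, the timestamps are not on a single regular grid.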

Dataframe of one observationtype#

Dataset.df and Station.df return a pandas DataFrame with a so-called MultiIndex. That is because an observation is defined by the combination of timestamp, observation type and station name.

Working with MultiIndexed DataFrames can be challenging, so here is an example of how to convert a MultiIndexed DataFrame to a regular-indexed DataFrame.

Be aware that removing (or reducing) the MultiIndex always involves subsetting or approximation.

[18]:

#Subset to only temperatures (=subsetting)

temperatures = dataset.df.xs(key='temp',
                             level='obstype', #the level of the index ('datetime', 'name' or 'obstype')
                             drop_level=True)

#You can see that the index now only has 2-levels:
temperatures

[18]:

		value	label
datetime	name
2022-09-01 00:00:00+00:00	vlinder01	18.799999	ok
	vlinder02	19.400000	ok
	vlinder03	17.000000	ok
	vlinder04	15.900000	ok
	vlinder05	21.100000	ok
...	...	...	...
2022-09-15 23:55:00+00:00	vlinder24	11.100000	ok
	vlinder25	14.100000	ok
	vlinder26	13.300000	ok
	vlinder27	14.300000	ok
	vlinder28	13.000000	ok

120960 rows × 2 columns
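
The row counts in these outputs can be cross-checked with some arithmetic. The sketch below uses only the period, resolution, number of stations and number of present observation types reported earlier on this page. It assumes that every combination of timestamp, station and observation type has exactly one row, which is what the printed counts suggest.

```python
from datetime import datetime, timedelta

first = datetime(2022, 9, 1, 0, 0)
last = datetime(2022, 9, 15, 23, 55)
resolution = timedelta(minutes=5)

# Number of timestamps, both endpoints included
n_timestamps = (last - first) // resolution + 1

n_stations = 28  # vlinder01 ... vlinder28
n_obstypes = 4   # humidity, temp, wind_direction, wind_speed

rows_one_obstype = n_timestamps * n_stations
rows_all_obstypes = rows_one_obstype * n_obstypes

print(n_timestamps, rows_one_obstype, rows_all_obstypes)  # 4320 120960 483840
```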

[19]:

#If we assume that the temperature observations of all stations share the same
#set of timestamps (typical after resampling), we can create a dataframe with all stations represented by columns.

temperatures_wide = (dataset.df
                    #first subset to temperatures
                    .xs(key='temp',
                            level='obstype', #the level of the index ('datetime', 'name' or 'obstype')
                            drop_level=True)
                    #Convert an index level to columns (unstacking)
                    .unstack(level='name'))
temperatures_wide

[19]:

	value										...	label
name	vlinder01	vlinder02	vlinder03	vlinder04	vlinder05	vlinder06	vlinder07	vlinder08	vlinder09	vlinder10	...	vlinder19	vlinder20	vlinder21	vlinder22	vlinder23	vlinder24	vlinder25	vlinder26	vlinder27	vlinder28
datetime
2022-09-01 00:00:00+00:00	18.799999	19.400000	17.000000	15.9	21.1	17.700001	18.1	19.200001	18.000000	19.100000	...	ok	ok	ok	ok	ok	ok	ok	ok	ok	ok
2022-09-01 00:05:00+00:00	18.799999	19.400000	16.900000	15.8	21.1	17.700001	18.1	19.100000	18.000000	19.000000	...	ok	ok	ok	ok	ok	ok	ok	ok	ok	ok
2022-09-01 00:10:00+00:00	18.799999	19.299999	16.799999	15.8	21.1	17.600000	18.0	19.100000	17.900000	18.900000	...	ok	ok	ok	ok	ok	ok	ok	ok	ok	ok
2022-09-01 00:15:00+00:00	18.700001	19.200001	16.700001	15.6	21.1	17.500000	18.0	19.000000	17.799999	18.900000	...	ok	ok	ok	ok	ok	ok	ok	ok	ok	ok
2022-09-01 00:20:00+00:00	18.700001	19.200001	16.600000	15.4	21.1	17.500000	18.1	19.000000	17.700001	18.799999	...	ok	ok	ok	ok	ok	ok	ok	ok	ok	ok
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
2022-09-15 23:35:00+00:00	13.200000	13.300000	12.200000	9.1	17.4	13.200000	13.4	14.400000	13.200000	14.300000	...	ok	ok	ok	ok	ok	ok	ok	ok	ok	ok
2022-09-15 23:40:00+00:00	13.100000	13.200000	12.200000	9.6	17.4	13.100000	13.4	14.300000	13.100000	14.200000	...	ok	ok	ok	ok	ok	ok	ok	ok	ok	ok
2022-09-15 23:45:00+00:00	13.000000	13.100000	12.200000	9.8	17.4	13.000000	13.3	14.300000	13.000000	14.200000	...	ok	ok	ok	ok	ok	ok	ok	ok	ok	ok
2022-09-15 23:50:00+00:00	12.900000	13.000000	12.300000	10.0	17.4	13.100000	13.3	14.200000	13.000000	14.200000	...	ok	ok	ok	ok	ok	ok	ok	ok	ok	ok
2022-09-15 23:55:00+00:00	12.900000	12.900000	12.400000	10.2	17.4	13.100000	13.2	14.200000	13.000000	14.100000	...	ok	ok	ok	ok	ok	ok	ok	ok	ok	ok

4320 rows × 56 columns
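
The assumption of shared timestamps matters because unstacking fills every missing combination with NaN. The sketch below shows this with pandas on a tiny hand-made dataframe. The values are illustrative and the block does not use the toolkit.

```python
import pandas as pd

t0 = pd.Timestamp("2022-09-01 00:00", tz="UTC")
t1 = pd.Timestamp("2022-09-01 00:05", tz="UTC")

# vlinder02 has no record at 00:05
index = pd.MultiIndex.from_tuples(
    [(t0, "vlinder01"), (t0, "vlinder02"), (t1, "vlinder01")],
    names=["datetime", "name"],
)
long_df = pd.DataFrame({"value": [18.8, 19.4, 18.8]}, index=index)

# One column per station; missing combinations become NaN
wide = long_df["value"].unstack(level="name")
n_missing = int(wide.isna().sum().sum())
print(wide)
print(n_missing)
```

When stations are not synchronized, most cells of the wide dataframe can end up as NaN, which is one reason to resample first.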

[20]:

#If you are only interested in the values, you can select them:
temperatures_wide['value']

[20]:

name	vlinder01	vlinder02	vlinder03	vlinder04	vlinder05	vlinder06	vlinder07	vlinder08	vlinder09	vlinder10	...	vlinder19	vlinder20	vlinder21	vlinder22	vlinder23	vlinder24	vlinder25	vlinder26	vlinder27	vlinder28
datetime
2022-09-01 00:00:00+00:00	18.799999	19.400000	17.000000	15.9	21.1	17.700001	18.1	19.200001	18.000000	19.100000	...	18.700001	19.400000	19.299999	18.799999	18.0	18.200001	18.900000	17.900000	19.600000	17.799999
2022-09-01 00:05:00+00:00	18.799999	19.400000	16.900000	15.8	21.1	17.700001	18.1	19.100000	18.000000	19.000000	...	18.600000	19.400000	19.299999	18.799999	18.0	18.200001	18.500000	17.700001	19.600000	17.799999
2022-09-01 00:10:00+00:00	18.799999	19.299999	16.799999	15.8	21.1	17.600000	18.0	19.100000	17.900000	18.900000	...	18.600000	19.299999	19.200001	18.700001	18.0	18.100000	18.299999	17.500000	19.500000	17.700001
2022-09-01 00:15:00+00:00	18.700001	19.200001	16.700001	15.6	21.1	17.500000	18.0	19.000000	17.799999	18.900000	...	18.500000	19.299999	19.200001	18.600000	18.0	18.000000	18.200001	17.299999	19.400000	17.799999
2022-09-01 00:20:00+00:00	18.700001	19.200001	16.600000	15.4	21.1	17.500000	18.1	19.000000	17.700001	18.799999	...	18.500000	19.200001	19.200001	18.299999	18.0	17.900000	18.100000	17.100000	19.299999	17.799999
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
2022-09-15 23:35:00+00:00	13.200000	13.300000	12.200000	9.1	17.4	13.200000	13.4	14.400000	13.200000	14.300000	...	14.500000	15.000000	15.700000	12.100000	13.9	11.700000	14.200000	13.400000	14.500000	13.400000
2022-09-15 23:40:00+00:00	13.100000	13.200000	12.200000	9.6	17.4	13.100000	13.4	14.300000	13.100000	14.200000	...	14.500000	15.000000	15.700000	12.100000	13.9	11.600000	14.200000	13.400000	14.500000	13.300000
2022-09-15 23:45:00+00:00	13.000000	13.100000	12.200000	9.8	17.4	13.000000	13.3	14.300000	13.000000	14.200000	...	14.400000	14.900000	15.600000	12.100000	13.8	11.400000	14.200000	13.400000	14.400000	13.200000
2022-09-15 23:50:00+00:00	12.900000	13.000000	12.300000	10.0	17.4	13.100000	13.3	14.200000	13.000000	14.200000	...	14.300000	14.900000	15.700000	12.000000	13.9	11.300000	14.200000	13.400000	14.400000	13.200000
2022-09-15 23:55:00+00:00	12.900000	12.900000	12.400000	10.2	17.4	13.100000	13.2	14.200000	13.000000	14.100000	...	14.300000	14.800000	15.600000	11.900000	13.9	11.100000	14.100000	13.300000	14.300000	13.000000

4320 rows × 28 columns
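
If you prefer a regular index and want to keep every record, another option is the pandas reset_index() method, which moves the index levels into ordinary columns. The sketch below uses a tiny hand-made dataframe with the same index level names as Dataset.df. The values are illustrative and the block does not use the toolkit.

```python
import pandas as pd

timestamps = pd.to_datetime(["2022-09-01 00:00", "2022-09-01 00:05"], utc=True)
index = pd.MultiIndex.from_product(
    [timestamps, ["humidity", "temp"], ["vlinder01"]],
    names=["datetime", "obstype", "name"],
)
long_df = pd.DataFrame(
    {"value": [65.0, 18.8, 65.0, 18.8], "label": ["ok"] * 4},
    index=index,
)

# All index levels become regular columns; the new index is 0, 1, 2, ...
flat = long_df.reset_index()
print(flat.columns.tolist())
```

The result is a long table with one row per observation, which many plotting and export tools accept directly.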

Quality control#

For more details, refer to the Quality Control Example Notebook.

Extracting data from Google Earth Engine#

For an introduction to extracting data from Google Earth Engine (GEE), we refer to the Using Google Earth Engine demo.

Filling gaps#

For an introduction to filling gaps, we refer to the Filling gaps demo.

Analysis#

For an introduction to analyzing your dataset, we refer to the Analysis demo.