metobs_toolkit.qc_collection.whitelist.WhiteSet

class WhiteSet(white_records: Index = Index([], dtype='object'))

Whitelist container for multiple stations and observation types.

This class manages a collection of whitelisted records across multiple stations and observation types. It uses a pandas Index or MultiIndex with optional levels for ‘name’ (station), ‘obstype’, and ‘datetime’ to define which records should be excluded from outlier detection in QC checks.

Parameters:

white_records (pd.Index, optional) – Index with levels ‘name’, ‘obstype’, and/or ‘datetime’ defining whitelisted records. Default is an empty Index.

Notes

  • The white_records index must contain at least one of: ‘name’, ‘obstype’, or ‘datetime’ as level names. If ‘datetime’ is not present, all timestamps for matching station/obstype combinations are whitelisted.

  • Timezone handling: If a ‘datetime’ level is present:

    • Timezone-aware timestamps are automatically converted to UTC

    • Timezone-naive timestamps are localized to UTC, and a warning is issued

    • It is strongly recommended to provide timezone-aware timestamps to avoid ambiguity
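The difference between the two timezone cases above can be illustrated with plain pandas. This is a minimal sketch of the equivalent pandas operations, not the toolkit's internal code; the timestamps and the +02:00 offset are made up for illustration.

```python
import pandas as pd

# Timezone-aware input (hypothetical timestamps with a +02:00 offset).
# Converting to UTC keeps the same instants and only shifts the wall-clock time.
aware = pd.to_datetime(["2023-07-01 14:00+02:00", "2023-07-01 15:00+02:00"])
aware_utc = aware.tz_convert("UTC")

# Timezone-naive input: localizing to UTC treats the wall-clock values as
# already being in UTC. If they were recorded in local time, the resulting
# instants are wrong, which is why timezone-aware input is recommended.
naive = pd.DatetimeIndex(["2023-07-01 14:00", "2023-07-01 15:00"])
naive_utc = naive.tz_localize("UTC")

print(aware_utc[0])  # 2023-07-01 12:00:00+00:00
print(naive_utc[0])  # 2023-07-01 14:00:00+00:00
```

The same wall-clock string ends up two hours apart depending on whether an offset was supplied, so whitelisted timestamps built from naive values may not match the intended records.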

__init__(white_records: Index = Index([], dtype='object')) → None

Initialize the WhiteSet with a collection of whitelisted records.

Parameters:

white_records (pandas.Index, optional) – Index (or MultiIndex) whose names are a subset of ['name', 'obstype', 'datetime']. Records matching this index are excluded from outlier detection. Default is an empty pandas.Index.
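As a rough sketch of what a valid white_records argument might look like, the example below builds two indexes with pandas. The station names ("station_A", "station_B") and observation types ("temp", "humidity") are hypothetical labels, and the WhiteSet call is left commented out so the snippet runs without the toolkit installed.

```python
import pandas as pd

# Whitelist two specific timestamps for one station and observation type.
# Station name and obstype are hypothetical; timestamps are timezone-aware (UTC).
white_records = pd.MultiIndex.from_arrays(
    [
        ["station_A", "station_A"],
        ["temp", "temp"],
        pd.to_datetime(["2023-07-01 12:00", "2023-07-01 13:00"], utc=True),
    ],
    names=["name", "obstype", "datetime"],
)

# Without a 'datetime' level, all timestamps of the matching
# station/obstype combination are whitelisted.
all_times = pd.MultiIndex.from_tuples(
    [("station_B", "humidity")],
    names=["name", "obstype"],
)

# Import path taken from this page's title; requires metobs_toolkit.
# from metobs_toolkit.qc_collection.whitelist import WhiteSet
# whiteset = WhiteSet(white_records=white_records)
```

Naming the levels explicitly matters here, since the level names are what identify each level as a station, an observation type, or a timestamp.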

Methods

__init__([white_records])

Initialize the WhiteSet with a collection of whitelisted records.

create_sensorwhitelist(stationname, obstype)

Create a sensor-specific whitelist for a station and observation type.

get_info([printout])

Retrieve and optionally print detailed information about the WhiteSet.