metobs_toolkit.gap.Gap.interpolate
- Gap.interpolate(sensordata: SensorData, method: str = 'time', max_gap_duration_to_fill: pd.Timedelta = Timedelta('0 days 03:00:00'), n_leading_anchors: int = 1, n_trailing_anchors: int = 1, max_lead_to_gap_distance: pd.Timedelta | None = None, max_trail_to_gap_distance: pd.Timedelta | None = None, method_kwargs: dict = {}) -> None
Fill the gap using interpolation of SensorData.
The gap is interpolated from the records in the leading and trailing periods around the gap. Different interpolation methods can be selected. Restrictions on the leading and trailing periods ensure that interpolation is only performed when enough leading and trailing data are available.
- Parameters:
sensordata (SensorData) – The corresponding SensorData used to interpolate the gap.
method (str, optional) – Interpolation technique to use. See the ‘method’ argument of pandas.DataFrame.interpolate for possible values. Make sure that n_leading_anchors, n_trailing_anchors and method_kwargs are set appropriately for the method (higher-order interpolation techniques require more leading and trailing anchors). Defaults to “time”.
max_gap_duration_to_fill (pandas.Timedelta, optional) – The maximum duration of a gap to fill with interpolation. The result is independent of the time resolution of the gap. Defaults to 3 hours.
n_leading_anchors (int, optional) – The number of leading anchors to use for the interpolation. A leading anchor is a nearby record (not rejected by QC) just before the start of the gap that is used for interpolation. Higher-order interpolation techniques require multiple leading anchors. Defaults to 1.
n_trailing_anchors (int, optional) – The number of trailing anchors to use for the interpolation. A trailing anchor is a nearby record (not rejected by QC) just after the end of the gap that is used for interpolation. Higher-order interpolation techniques require multiple trailing anchors. Defaults to 1.
max_lead_to_gap_distance (pandas.Timedelta or None, optional) – The maximum time difference between the start of the gap and the leading anchor(s). If None, no time restriction is applied to the leading anchors. Defaults to None.
max_trail_to_gap_distance (pandas.Timedelta or None, optional) – The maximum time difference between the end of the gap and the trailing anchor(s). If None, no time restriction is applied to the trailing anchors. Defaults to None.
method_kwargs (dict, optional) – Extra keyword arguments, given as a dict, that are passed to pandas.DataFrame.interpolate(). Defaults to {}.
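To make the anchor parameters more concrete, the sketch below shows one plausible reading of how n_leading_anchors and max_lead_to_gap_distance could interact. It uses plain pandas only. The helper select_leading_anchors and the sample records are hypothetical and are not part of metobs_toolkit; the toolkit's actual selection logic may differ.

```python
import pandas as pd


def select_leading_anchors(valid_records, gap_start, n_anchors=1, max_distance=None):
    """Hypothetical helper: pick the last `n_anchors` valid records before the gap.

    If `max_distance` is given, anchors further away from the gap start are dropped.
    """
    before = valid_records[valid_records.index < gap_start].dropna()
    anchors = before.iloc[-n_anchors:] if n_anchors > 0 else before.iloc[0:0]
    if max_distance is not None:
        # Keep only anchors that are close enough to the start of the gap.
        anchors = anchors[(gap_start - anchors.index) <= max_distance]
    return anchors


# Hypothetical 10-minute records that passed QC, followed by a gap starting at 00:40.
records = pd.Series(
    [10.0, 10.2, 10.4, 10.6],
    index=pd.date_range("2023-01-01 00:00", periods=4, freq="10min"),
)
gap_start = pd.Timestamp("2023-01-01 00:40")

# Without a distance restriction, both requested anchors are found.
two = select_leading_anchors(records, gap_start, n_anchors=2)

# With a 15-minute restriction, the 00:20 record is too far from the gap.
restricted = select_leading_anchors(
    records, gap_start, n_anchors=2, max_distance=pd.Timedelta("15min")
)
print(len(two), len(restricted))
```

In this reading, the restricted call yields fewer anchors than requested, which is the situation in which an interpolation would presumably be skipped rather than performed on too little data. The same idea applies, mirrored, to n_trailing_anchors and max_trail_to_gap_distance.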
Notes
A schematic description:
1. Get the leading and trailing periods of the gap.
2. Check if the leading and trailing periods are valid.
3. Create a combined DataFrame with the leading, trailing, and gap data.
4. Interpolate the missing records using the specified method.
5. Update the gap attributes with the interpolated values, labels, and details.
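The steps above can be sketched with plain pandas. This is a minimal illustration under assumed data (one leading anchor, one trailing anchor, a 10-minute resolution), not the toolkit's implementation; the variable names are hypothetical, a Series is used instead of a DataFrame for brevity, and the final step of updating the gap attributes is left out.

```python
import pandas as pd

# Hypothetical 10-minute timestamps: 00:00 is the leading anchor,
# 00:10 to 00:30 form the gap, 00:40 is the trailing anchor.
index = pd.date_range("2023-01-01 00:00", periods=5, freq="10min")
gap_index = index[1:4]

# Step 1: the leading and trailing periods (one anchor each here).
leading = pd.Series([10.0], index=index[:1])
trailing = pd.Series([12.0], index=index[-1:])

# Step 2: a simple validity check, assuming "valid" means "enough anchors found".
if leading.empty or trailing.empty:
    raise ValueError("Not enough anchors to interpolate the gap.")

# Step 3: combine anchors and gap timestamps; gap records become NaN.
anchors = pd.concat([leading, trailing])
combined = anchors.reindex(anchors.index.union(gap_index))

# Step 4: interpolate the missing records, weighting by the time index.
filled = combined.interpolate(method="time")
gap_filled = filled.loc[gap_index]
print(gap_filled)
```

With method="time" the interpolation is linear in time, so the three gap records are spaced evenly between the two anchor values.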
Note
If you want to use a higher-order method of interpolation, make sure to increase the n_leading_anchors and n_trailing_anchors accordingly. For example, for a cubic interpolation, you need at least 2 leading and 2 trailing anchors.
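The anchor requirement for cubic interpolation follows from counting unknowns: a cubic polynomial has four coefficients, so four known points are needed to determine it. The sketch below illustrates this with numpy.polyfit as a stand-in; it is not the pandas interpolation routine that the method delegates to, and the sample function and timestamps (in hours) are hypothetical.

```python
import numpy as np


def cubic(t):
    """Hypothetical 'true' signal, used only to generate sample values."""
    return 0.1 * t**3 - 0.5 * t**2 + t + 10.0


# Two leading anchors (t = 0, 1) and two trailing anchors (t = 5, 6), in hours.
anchor_t = np.array([0.0, 1.0, 5.0, 6.0])
anchor_values = cubic(anchor_t)

# The gap covers t = 2, 3, 4.
gap_t = np.array([2.0, 3.0, 4.0])

# Four anchors determine the four coefficients of a cubic uniquely.
coeffs = np.polyfit(anchor_t, anchor_values, deg=3)
gap_values = np.polyval(coeffs, gap_t)

print(np.allclose(gap_values, cubic(gap_t)))
```

With only one leading and one trailing anchor, the same fit would be underdetermined, which is why n_leading_anchors and n_trailing_anchors need to be raised for higher-order methods.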