module qc

Implementation of the quality control (QC) checks.

class app.qc.BuddyCheckConfig

Configuration for the buddy check and isolation check implemented via titanlib.buddy_check() and titanlib.isolation_check().

callable: Callable[..., ndarray[tuple[Any, ...], dtype[integer]]]
elev_gradient: float
max_elev_diff: float
min_std: float
num_iterations: int
num_min: int
radius: float
threshold: float
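
As a rough illustration of how these attributes could fit together, the sketch below defines a hypothetical dataclass (BuddyCheckConfigSketch) with the same field names and passes every field except callable on as keyword arguments. The placeholder function flag_nothing stands in for the titanlib function, and all numeric values are arbitrary placeholders, not recommended settings.

```python
from dataclasses import dataclass, fields
from typing import Any, Callable

import numpy as np
import numpy.typing as npt


@dataclass(frozen=True)
class BuddyCheckConfigSketch:
    """Hypothetical stand-in mirroring the documented attribute names."""

    callable: Callable[..., npt.NDArray[np.integer]]
    elev_gradient: float
    max_elev_diff: float
    min_std: float
    num_iterations: int
    num_min: int
    radius: float
    threshold: float


def flag_nothing(values: Any, **params: Any) -> npt.NDArray[np.integer]:
    """Placeholder for the real check: returns the integer 0 for every value."""
    return np.zeros(len(values), dtype=np.int64)


# Arbitrary placeholder values, chosen only to make the example run.
config = BuddyCheckConfigSketch(
    callable=flag_nothing,
    elev_gradient=-0.0065,
    max_elev_diff=200.0,
    min_std=1.0,
    num_iterations=2,
    num_min=5,
    radius=5000.0,
    threshold=2.0,
)

# Forward every field except the function itself as a keyword argument.
params = {f.name: getattr(config, f.name) for f in fields(config) if f.name != "callable"}
flags = config.callable([10.1, 10.3, 25.0], **params)
```

Keeping the function and its parameters in one object would let the same calling code serve both the buddy check and the isolation check; whether app.qc does this is not stated here.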
async app.qc.apply_buddy_check(data, config) → DataFrame

Apply the buddy check to the data for the given time period.

Parameters:

data (DataFrame) – The data to apply the buddy check to. It must have a

Return type:

DataFrame

Returns:

A DataFrame with the buddy check results as flags.

async app.qc.apply_qc(data, station_id) → DataFrame

Apply the quality control to the data for a given station and time period.

This function applies various quality control checks to the data, such as range_check(), persistence_check(), and spike_dip_check().

Parameters:
  • data (DataFrame) – The data to apply quality control to.

  • station_id – The ID of the station to apply quality control for.

Return type:

DataFrame

Returns:

The data with the quality control applied as flags.
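
The following is a minimal sketch of how several asynchronous checks might be applied column by column, with each result stored as an additional flag column. The names apply_checks_sketch and in_range, the column names, and the "_range_ok" suffix are all hypothetical; the actual column layout and flag naming used by apply_qc() are not documented here.

```python
import asyncio

import pandas as pd


async def in_range(s: pd.Series, *, lower_bound: float, upper_bound: float, **kwargs) -> pd.Series:
    """Hypothetical stand-in for a range check (bounds treated as inclusive)."""
    return s.between(lower_bound, upper_bound)


async def apply_checks_sketch(data: pd.DataFrame, checks: list) -> pd.DataFrame:
    """Run each (column, suffix, check, params) entry and attach the result as a flag column."""
    flagged = data.copy()
    for column, suffix, check, params in checks:
        flagged[f"{column}_{suffix}"] = await check(data[column], **params)
    return flagged


data = pd.DataFrame(
    {"air_temperature": [12.5, -60.0, 21.0], "relative_humidity": [55.0, 101.0, 48.0]},
    index=pd.date_range("2024-01-01 00:00", periods=3, freq="5min"),
)
checks = [
    ("air_temperature", "range_ok", in_range, {"lower_bound": -40.0, "upper_bound": 50.0}),
    ("relative_humidity", "range_ok", in_range, {"lower_bound": 0.0, "upper_bound": 100.0}),
]
flagged = asyncio.run(apply_checks_sketch(data, checks))
```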

async app.qc.calculate_qc_score(data) → pd.Series[float]

Calculate the quality control score for the data.

Parameters:

data – The data to calculate the quality control score for.

Returns:

A Series with the quality control score for each row.

async app.qc.persistence_check(s, *, window, excludes=[], station, con, **kwargs) → pd.Series[bool]

Check if the values in the series are persistent.

This requires fetching additional data from the database so that the check can tell whether the values have stayed the same for window minutes.

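Parameters:
  • s – The pandas Series to check, which must have a pd.DatetimeIndex.

  • window – The length of the persistence window, in minutes.

  • station – The station to check.

  • con – The database connection to use.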

Returns:

A boolean Series indicating whether each value is persistent.
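
A minimal, synchronous sketch of the idea, assuming that "persistent" means the value has not changed over the preceding window minutes. The database lookup is omitted, so values without a full window of history are simply reported as not persistent; the function name persistence_check_sketch and this handling of the start of the series are assumptions, not the behaviour of the actual implementation.

```python
import pandas as pd


def persistence_check_sketch(s: pd.Series, *, window: int) -> pd.Series:
    """Flag values that have stayed identical over the last `window` minutes."""
    # closed="both" makes each window span the full interval [t - window, t].
    rolling = s.rolling(f"{window}min", closed="both")
    unchanged = rolling.max() == rolling.min()
    # Without extra history (e.g. from a database), early values cannot be judged.
    has_full_history = pd.Series(
        (s.index - s.index[0]) >= pd.Timedelta(minutes=window), index=s.index
    )
    return unchanged & has_full_history


s = pd.Series(
    [5.0, 5.0, 5.0, 5.0, 5.1, 5.1, 5.1, 5.1],
    index=pd.date_range("2024-01-01 00:00", periods=8, freq="10min"),
)
persistent = persistence_check_sketch(s, window=30)
```

In this example only the values at 00:30 and 01:10 are flagged, since they are the only ones preceded by a full 30 minutes of identical readings.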

async app.qc.range_check(s, *, lower_bound, upper_bound, **kwargs) → pd.Series[bool]

Check if the values in the series are within the specified range.

Parameters:
  • s – The pandas Series to check, which must have a pd.DatetimeIndex.

  • lower_bound – The lower bound of the range.

  • upper_bound – The upper bound of the range.

Returns:

A boolean Series indicating whether each value is within the range.
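
A minimal sketch of such a check using Series.between(), under the assumption that both bounds are inclusive (the description above does not say). The name range_check_sketch and the sample values are illustrative only.

```python
import asyncio

import pandas as pd


async def range_check_sketch(
    s: pd.Series, *, lower_bound: float, upper_bound: float, **kwargs
) -> pd.Series:
    """Return True where lower_bound <= value <= upper_bound (inclusive by assumption)."""
    return s.between(lower_bound, upper_bound)


s = pd.Series(
    [12.5, -60.0, 21.0, 75.0],
    index=pd.date_range("2024-01-01 00:00", periods=4, freq="5min"),
)
within = asyncio.run(range_check_sketch(s, lower_bound=-40.0, upper_bound=50.0))
```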

async app.qc.spike_dip_check(s, *, delta, station, con, **kwargs) → pd.Series[bool]

Check whether there are spikes or dips in the data.

This requires fetching additional data from the database so that the check can tell whether a value changes by more than delta per minute.

Parameters:
  • s – The pandas Series to check, which must have a pd.DatetimeIndex.

  • delta – The threshold for the spike/dip check per minute.

  • station – The station to check.

  • con – The database connection to use.

Returns:

A boolean Series indicating whether each value is a spike or dip.
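
A minimal, synchronous sketch of a per-minute spike/dip test, assuming a value is flagged when its absolute change from the previous value, divided by the elapsed minutes, exceeds delta. The database lookup is omitted, so the first value has no predecessor and is never flagged; the name spike_dip_check_sketch and that edge-case handling are assumptions.

```python
import pandas as pd


def spike_dip_check_sketch(s: pd.Series, *, delta: float) -> pd.Series:
    """Flag values whose absolute rate of change exceeds `delta` per minute."""
    # Minutes elapsed since the previous observation (NaN for the first one).
    minutes = s.index.to_series().diff().dt.total_seconds() / 60.0
    rate = s.diff().abs() / minutes
    # Comparisons with NaN evaluate to False, so the first value is not flagged.
    return rate > delta


s = pd.Series(
    [10.0, 10.2, 18.0, 10.4, 10.5],
    index=pd.date_range("2024-01-01 00:00", periods=5, freq="5min"),
)
spikes = spike_dip_check_sketch(s, delta=0.5)
```

With this definition both the jump to 18.0 and the return to 10.4 are flagged, because each differs from its predecessor by more than 0.5 per minute.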