ceilometer_toolbox

class ceilometer_toolbox.Ceilometer(device_id, input_dir, archive, raw2l1_config_file=None, stratfinder_config_file=None, stratfinder_qc_value_config_file=None, stratfinder_qc_metadata_file=None) -> None

Class for processing ceilometer data and making plots.

beta_plot(start_date, end_date, output_path, alt_max=None, show_mlh=False, show_ablh=False, show_cbh=False, filter_qc=True, resampler=<function resample_dataset>, **kwargs) -> Figure

Make a plot of the backscatter coefficient (beta) over time.

Parameters:
  • start_date (datetime) – The start date of the plot.

  • end_date (datetime) – The end date of the plot.

  • output_path (str) – The path to save the plot to.

  • alt_max (int | None) – The maximum altitude to plot. If None, the maximum altitude in the dataset will be used. (default: None)

  • show_mlh (bool) – Whether to show the mixed layer height (MLH) on the plot. (default: False)

  • show_ablh (bool) – Whether to show the aerosol boundary layer height (ABLH) on the plot. (default: False)

  • show_cbh (bool) – Whether to show the cloud base height (CBH) on the plot. (default: False)

  • filter_qc (bool) – Whether to filter the MLH, ABLH, and CBH values based on the quality flag. If True, only values with a quality flag of 0 and a precipitation flag of 0 will be shown. (default: True)

  • kwargs (dict[str, Any]) – Additional keyword arguments to pass to the xarray plotting function. This can be used to customize the plot, e.g. by changing the colormap or the colorbar settings.

Return type:

Figure

Returns:

The figure object of the plot.
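The filter_qc rule described above (keep a value only when both its quality flag and its precipitation flag are 0) can be illustrated with a minimal sketch in plain Python. The helper name keep_good_values and the list-based inputs are hypothetical; the toolbox works on xarray datasets and its actual implementation may differ.

```python
import math


def keep_good_values(heights, quality_flags, precipitation_flags):
    """Replace values whose flags are not both 0 with NaN (hypothetical helper)."""
    filtered = []
    for height, quality, precipitation in zip(
        heights, quality_flags, precipitation_flags
    ):
        # Keep a value only when both flags are 0.
        if quality == 0 and precipitation == 0:
            filtered.append(height)
        else:
            filtered.append(math.nan)
    return filtered


# Made-up sample values, for illustration only.
mlh = [850.0, 900.0, 1200.0, 1100.0]
quality = [0, 1, 0, 0]
precipitation = [0, 0, 1, 0]
filtered_mlh = keep_good_values(mlh, quality, precipitation)
```

Masking with NaN rather than dropping entries keeps the filtered series aligned with the time axis, which is usually what a line overlay on a plot needs.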

glob_day_raw_data(file_date, prefix) -> list[str]

Glob the raw ceilometer files for a given date and prefix.

Parameters:
  • file_date (date) – The date of the files to glob.

  • prefix (str) – The prefix of the files to glob. This is usually live_ for raw files, but may be different if the naming convention is different. The glob pattern is {prefix}{file_date:%Y%m%d}_*.nc.

Return type:

list[str]

Returns:

list of matching file paths (unsorted)
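To make the documented pattern {prefix}{file_date:%Y%m%d}_*.nc concrete, here is a small standalone sketch using only the standard library. The helper build_pattern and the sample file names are hypothetical, and the sketch does not call the toolbox itself.

```python
import glob
import os
import tempfile
from datetime import date


def build_pattern(file_date, prefix):
    """Format the documented pattern {prefix}{file_date:%Y%m%d}_*.nc."""
    return f"{prefix}{file_date:%Y%m%d}_*.nc"


pattern = build_pattern(date(2024, 1, 2), "live_")

# Demonstrate the match in a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as input_dir:
    for name in (
        "live_20240102_000000.nc",
        "live_20240102_120000.nc",
        "live_20240103_000000.nc",
    ):
        open(os.path.join(input_dir, name), "w").close()
    # glob does not guarantee an order, so sort when order matters.
    matches = sorted(
        os.path.basename(path)
        for path in glob.glob(os.path.join(input_dir, pattern))
    )
```

Because the result is documented as unsorted, callers that depend on chronological order would likely want to sort the returned list, as the sketch does.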

ldr_plot(start_date, end_date, output_path, alt_max=None, resampler=<function resample_dataset>, **kwargs) -> Figure

Make a plot of the linear depolarisation ratio (LDR) over time.

Parameters:
  • start_date (datetime) – The start date of the plot.

  • end_date (datetime) – The end date of the plot.

  • output_path (str) – The path to save the plot to.

  • alt_max (int | None) – The maximum altitude to plot. If None, the maximum altitude in the dataset will be used. (default: None)

  • kwargs (dict[str, Any]) – Additional keyword arguments to pass to the xarray plotting function. This can be used to customize the plot, e.g. by changing the colormap or the colorbar settings.

Return type:

Figure

Returns:

The figure object of the plot.

process_l1_files(start_date=None, end_date=None, config_file=None, directory_mount=None, in_docker=True, executable_path=None) -> int

Process the L1 files for the given date and all subsequent dates until end_date using the stratfinder algorithm.

Parameters:
  • start_date (date | str | None) – The date to start processing from. (default: None)

  • end_date (date | str | None) – The date to stop processing at. If None, processing will continue until the current date. (default: None)

  • config_file (str | None) – The path to the stratfinder configuration file (json). (default: None)

  • directory_mount (str | None) – The directory to mount in the Docker container. (default: None)

  • in_docker (bool) – Whether to run stratfinder in a Docker container or use a local executable. (default: True)

  • executable_path (str | None) – The path to the local stratfinder executable. This is only used if in_docker is False. This should be the bash script that is provided along with the stratfinder Matlab distribution. (default: None)

Return type:

int

process_raw_files(start_date=None, end_date=None, prefix='live_', jobs=1, config_file=None) -> int

Process raw ceilometer files since a given date and convert them to level 1.

Parameters:
  • start_date (date | str | None) – The date to start processing from. This can be a date object or a string in the format YYYY-MM-DD. If None, processing will start from the most recently processed L1 date already in the archive (defaults to 1970-01-01 if no L1 files exist yet). (default: None)

  • end_date (date | str | None) – The date to stop processing at. This can be a date object or a string in the format YYYY-MM-DD. If None, processing will continue until the current date. (default: None)

  • prefix (str) – The prefix of the raw files to process. This is usually live_ (default: 'live_')

  • jobs (int) – The number of parallel processes to use for processing the files. (default: 1)

  • config_file (str | None) – Option to override the raw2l1 configuration file provided in the class initialization. (default: None)

Return type:

int
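Since start_date and end_date accept a date object, a YYYY-MM-DD string, or None, the following sketch shows one way such input could be normalized. The helper normalize_date is hypothetical and only illustrates the accepted forms; how the toolbox resolves None internally is described in the parameter list above and is not reproduced here.

```python
from datetime import date


def normalize_date(value, default):
    """Turn a date, a YYYY-MM-DD string, or None into a date (hypothetical helper)."""
    if value is None:
        return default
    if isinstance(value, str):
        # date.fromisoformat parses the YYYY-MM-DD form.
        return date.fromisoformat(value)
    return value


start = normalize_date("2024-03-01", default=date(1970, 1, 1))
fallback = normalize_date(None, default=date(1970, 1, 1))
unchanged = normalize_date(date(2024, 3, 5), default=date(1970, 1, 1))
```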

process_stratfinder_qc(start_date=None, end_date=None, config_file=None, value_config_file=None, stratfinder_metadata_file=None) -> int

Process the stratfinder output files for the given date and all subsequent dates until end_date using the stratfinder QC algorithm.

This cannot be run in parallel since it depends on the output of the previous day.

Parameters:
  • archive – The CeilometerArchive instance to use for reading and writing files.

  • start_date (date | str | None) – The date to start processing from. (default: None)

  • end_date (date | str | None) – The date to stop processing at. If None, processing will continue until the current date. (default: None)

  • config_file (str | None) – The path to the stratfinder QC config file (json). (default: None)

  • value_config_file (str | None) – The path to the value config file (toml) for the stratfinder QC. (default: None)

  • stratfinder_metadata_file (str | None) – The path to the stratfinder metadata file (toml) for the stratfinder QC. (default: None)

Return type:

int
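The day-to-day dependency mentioned above is the reason processing stays serial. The sketch below shows the chronological iteration this implies, pairing each day with its predecessor. The helper day_pairs is hypothetical, and treating end_date as inclusive is an assumption that the text above does not state.

```python
from datetime import date, timedelta


def day_pairs(start_date, end_date):
    """Yield (today, yesterday) pairs in chronological order, end inclusive."""
    current = start_date
    while current <= end_date:
        yield current, current - timedelta(days=1)
        current += timedelta(days=1)


# Each step would need the previous day's result, so the loop cannot be
# split across parallel workers.
pairs = list(day_pairs(date(2024, 2, 28), date(2024, 3, 1)))
```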

static stratfinder_in_docker(today_file, output_file, beta_file, config_file, yesterday_file=None, overlap_file=None, container_image='stratfinder:latest', directory_mount=None) -> int

Run the stratfinder algorithm in a Docker container. This cannot be run in parallel since it depends on the output of the previous day.

A container is used because the stratfinder algorithm is implemented in Matlab and requires the Matlab Runtime to run.

Parameters:
  • config_file (str) – The path to the stratfinder configuration file (json)

  • today_file (str) – The path to the input file for the current day to process. This should be an L1 file output from the raw2l1 tool.

  • output_file (str) – Path to the output file for the stratfinder results.

  • beta_file (str) – The path to the output file for the beta results produced by stratfinder.

  • yesterday_file (str | None) – The path to the input file for the previous day to process. This should be an L1 file output from the raw2l1 tool. (default: None)

  • overlap_file (str | None) – The path to the input file for the overlap correction. This can be omitted if no overlap correction is desired. (default: None)

  • container_image (str) – The name of the Docker image to use for running stratfinder. Please see: https://github.com/RUBclim/STRATfinder-docker (default: 'stratfinder:latest')

  • directory_mount (str | None) – The directory to mount in the Docker container. This should be an absolute path. If None, the current working directory will be used. The input and output files should be located in this directory or its subdirectories. (default: None)

Return type:

int
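Because the input and output files have to live in directory_mount or one of its subdirectories, a caller may want to verify this before starting the container. The following is a minimal sketch of such a check; the helper is_inside_mount and the sample paths are hypothetical and not part of the toolbox.

```python
from pathlib import PurePosixPath


def is_inside_mount(file_path, directory_mount):
    """Return True if file_path is directory_mount or lies below it."""
    path = PurePosixPath(file_path)
    mount = PurePosixPath(directory_mount)
    # Both paths are expected to be absolute; symlinks and ".." segments
    # are not resolved in this sketch.
    return path == mount or mount in path.parents


inside = is_inside_mount("/data/ceilo/L1/20240102_L1.nc", "/data/ceilo")
outside = is_inside_mount("/tmp/20240102_L1.nc", "/data/ceilo")
```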

static stratfinder_local(executable_path, today_file, output_file, beta_file, config_file, yesterday_file=None, overlap_file=None) -> int

Run the stratfinder algorithm locally. This cannot be run in parallel since it depends on the output of the previous day.

Parameters:
  • executable_path (str) – The path to the stratfinder executable. This should be the bash script that is provided along with the stratfinder Matlab distribution.

  • config_file (str) – The path to the stratfinder configuration file (json)

  • today_file (str) – The path to the input file for the current day to process. This should be an L1 file output from the raw2l1 tool.

  • output_file (str) – Path to the output file for the stratfinder results.

  • beta_file (str) – The path to the output file for the beta results produced by stratfinder.

  • yesterday_file (str | None) – The path to the input file for the previous day to process. This should be an L1 file output from the raw2l1 tool. (default: None)

  • overlap_file (str | None) – The path to the input file for the overlap correction. This can be omitted if no overlap correction is desired. (default: None)

Return type:

int

to_l1(file_date, input_files, output_file, config_file=None, ancillary_files=[], min_file_size=0, check_timeliness=False, filter_max_age=2, filter_day=False, log_file=None, log_level='info', verbose='info') -> int

Convert raw ceilometer files to level 1 using the raw2l1 tool.

Parameters:
  • file_date (date) – The date of the files to process

  • config_file (str | None) – The path to the raw2l1 configuration file (default: None)

  • input_files (str | list[str]) – The raw files to process, can be a single file or a list of files

  • output_file (str) – The path to the output file

  • ancillary_files (str | list[str]) – The ancillary files to use, can be a single file or a list of files (default: [])

  • min_file_size (int) – The minimum size of input file in bytes. Files with a smaller size will be rejected. (default: 0)

  • check_timeliness (bool) – Check that the data read are not too old or in the future. By default, data may be at most 2 hours old. This value can be changed with filter_max_age. (default: False)

  • filter_max_age (int) – The maximum age of data in a file, in hours. (default: 2)

  • filter_day (bool) – Only keep data of date provided as arguments (default: False)

  • log_file (str | None) – File where logs will be saved (default: None)

  • log_level (str) – Level of logs stored in the log file. Choices are debug, info, warning, error, critical. (default: 'info')

  • verbose (str) – Verbosity level in the terminal. Same choices as log_level. (default: 'info')

Return type:

int

Returns:

The return code of the raw2l1 tool, 0 if successful, non-zero otherwise

class ceilometer_toolbox.CeilometerArchive(root_dir) -> None

Query ceilometer output files stored as daily NetCDF files.

Expected directory layout: <root>/<device_id>/<YYYY>/<MM>/<YYYYMMDD>_<file_type>.nc
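As an illustration of the layout above, this sketch composes a path from its parts. The function archive_path, the root /archive, and the device ID CL61_01 are hypothetical; the sketch only mirrors the documented layout and does not touch the file system.

```python
from datetime import date
from pathlib import PurePosixPath


def archive_path(root_dir, device_id, file_type, file_date):
    """Compose <root>/<device_id>/<YYYY>/<MM>/<YYYYMMDD>_<file_type>.nc."""
    return PurePosixPath(
        root_dir,
        device_id,
        f"{file_date:%Y}",
        f"{file_date:%m}",
        f"{file_date:%Y%m%d}_{file_type}.nc",
    )


path = archive_path("/archive", "CL61_01", "L2A_beta", date(2024, 1, 2))
```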

VALID_FILE_TYPES: frozenset[Literal['L1', 'L2A_beta', 'L2A_stratfinder', 'L2B_stratfinder']] = frozenset({'L1', 'L2A_beta', 'L2A_stratfinder', 'L2B_stratfinder'})

atomic_put_file(device_id, file_type, file_date, override=False) -> Generator[str]

Prepare one archive output path for atomic publication.

This is the atomic variant of put_file(). It yields a temporary path in the target directory and atomically publishes to the canonical archive path on successful context exit.
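The stage-then-publish pattern described above can be sketched with the standard library. This is a generic illustration under stated assumptions, not the toolbox's implementation: the context manager atomic_output is hypothetical, and handling of override and of any documented exceptions is left out.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def atomic_output(target_path):
    """Yield a temporary path next to target_path; publish it on success."""
    target_dir = os.path.dirname(target_path) or "."
    os.makedirs(target_dir, exist_ok=True)
    # Staging in the target directory keeps the final rename on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, suffix=".tmp")
    os.close(fd)
    try:
        yield tmp_path
        # Reached only if the with-body did not raise.
        os.replace(tmp_path, target_path)
    finally:
        # After a successful replace the temporary name no longer exists.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "2024", "01", "20240102_L1.nc")
    with atomic_output(target) as staged:
        with open(staged, "w") as handle:
            handle.write("payload")
    published = os.path.exists(target)
    leftovers = [
        name for name in os.listdir(os.path.dirname(target)) if name.endswith(".tmp")
    ]
```

With this pattern a reader of the archive sees either no file or the complete file, never a partially written one, because os.replace swaps the name in a single step when source and destination are on the same filesystem.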

Parameters:
  • device_id (str) – station/device ID

  • file_type (Literal['L1', 'L2A_beta', 'L2A_stratfinder', 'L2B_stratfinder']) – one of L1, L2A_beta, L2A_stratfinder or L2B_stratfinder

  • file_date (str | date | datetime) – target file date

  • override (bool) – allow replacing existing target file when True (default: False)

Return type:

Generator[str]

Returns:

iterator yielding temporary staged output path

Raises:

delete_file(device_id, file_type, file_date) -> bool

Delete one archive file from the tree.

Parameters:
  • device_id (str) – station/device ID

  • file_type (Literal['L1', 'L2A_beta', 'L2A_stratfinder', 'L2B_stratfinder']) – one of L1, L2A_beta, L2A_stratfinder or L2B_stratfinder

  • file_date (str | date | datetime) – file date to delete

Return type:

bool

Returns:

True if a file was deleted, otherwise False

get_file_or_none(device_id, file_type, file_date) -> str | None

Return the single file path for one device, type and date.

Parameters:
  • device_id (str) – station/device ID

  • file_type (Literal['L1', 'L2A_beta', 'L2A_stratfinder', 'L2B_stratfinder']) – one of L1, L2A_beta, L2A_stratfinder or L2B_stratfinder

  • file_date (str | date | datetime) – target file date

Return type:

str | None

Returns:

full path to the matching file, or None if missing

get_files(device_id=None, file_type=None, start_date=None, end_date=None) -> list[str]

Return list variant of iter_files() for convenience.

Accepts the same arguments as iter_files() and raises the same exceptions.

Return type:

list[str]

iter_files(device_id=None, file_type=None, start_date=None, end_date=None) -> Iterator[str]

Yield file paths for file type and an inclusive date interval.

Defaults:
  • device_id=None -> all available devices

  • file_type=None -> all supported file types

  • start_date/end_date=None -> min/max available dates in the archive

Parameters:
  • device_id (str | None) – optional station/device ID filter (default: None)

  • file_type (Literal['L1', 'L2A_beta', 'L2A_stratfinder', 'L2B_stratfinder'] | None) – optional file type filter (L1, L2A_beta, L2A_stratfinder, L2B_stratfinder) (default: None)

  • start_date (str | date | datetime | None) – optional inclusive lower bound (default: None)

  • end_date (str | date | datetime | None) – optional inclusive upper bound (default: None)

Return type:

Iterator[str]

Returns:

iterator over matching file paths

Raises:

ValueError – if file type is unsupported or start_date > end_date
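The inclusive date filtering and the ValueError for a reversed interval can be sketched against the documented directory layout. The function iter_archive_files below is a hypothetical stand-in that handles one file type across all devices; it is not the toolbox's iter_files and omits the None defaults.

```python
import tempfile
from datetime import date, datetime
from pathlib import Path


def iter_archive_files(root_dir, file_type, start_date, end_date):
    """Yield paths of one file type whose date lies in [start_date, end_date]."""
    if start_date > end_date:
        raise ValueError("start_date must not be after end_date")
    # Layout: <root>/<device_id>/<YYYY>/<MM>/<YYYYMMDD>_<file_type>.nc
    for path in sorted(Path(root_dir).glob(f"*/*/*/*_{file_type}.nc")):
        stamp, _, found_type = path.stem.partition("_")
        if found_type != file_type:
            continue
        file_date = datetime.strptime(stamp, "%Y%m%d").date()
        # Both bounds are inclusive.
        if start_date <= file_date <= end_date:
            yield str(path)


# Build a tiny throwaway archive with a made-up device ID.
with tempfile.TemporaryDirectory() as root:
    folder = Path(root, "DEV1", "2024", "01")
    folder.mkdir(parents=True)
    for day in (1, 2, 3):
        (folder / f"202401{day:02d}_L1.nc").touch()
        (folder / f"202401{day:02d}_L2A_beta.nc").touch()
    found = [
        Path(p).name
        for p in iter_archive_files(root, "L1", date(2024, 1, 2), date(2024, 1, 3))
    ]
```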

latest_date(device_id, file_type, from_date=None, max_depth_days=3660) -> date | None

Return the newest available date with stop-early backward traversal.

Parameters:
  • device_id (str) – station/device ID

  • file_type (Literal['L1', 'L2A_beta', 'L2A_stratfinder', 'L2B_stratfinder']) – one of L1, L2A_beta, L2A_stratfinder or L2B_stratfinder

  • from_date (str | date | datetime | None) – optional start point for backward search (defaults to today) (default: None)

  • max_depth_days (int) – maximum number of days to look back (inclusive) (default: 3660)

Return type:

date | None

Returns:

latest available date, or None if no file exists

Raises:

ValueError – if max_depth_days is negative
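A stop-early backward search of this kind can be sketched as follows. The function latest_available and the set of available dates are hypothetical; reading "inclusive" as "a file exactly max_depth_days back is still checked" is an assumption about the documented behaviour.

```python
from datetime import date, timedelta


def latest_available(available, from_date, max_depth_days=3660):
    """Walk backward from from_date and return the first date in available."""
    if max_depth_days < 0:
        raise ValueError("max_depth_days must not be negative")
    # range(max_depth_days + 1) makes the look-back depth inclusive.
    for offset in range(max_depth_days + 1):
        candidate = from_date - timedelta(days=offset)
        if candidate in available:
            return candidate  # stop early at the newest hit
    return None


available = {date(2024, 1, 5), date(2024, 1, 9)}
hit = latest_available(available, date(2024, 1, 12))
miss = latest_available(available, date(2024, 1, 12), max_depth_days=2)
edge = latest_available(available, date(2024, 1, 12), max_depth_days=3)
```

Searching backward from a known point avoids listing the whole archive when recent files exist, which is the common case for incremental processing.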

open_dataset(device_id, file_type, start_date=None, end_date=None, **kwargs) -> Generator[Dataset]

Open matching files as one xarray dataset and slice by date range.

Parameters:
  • device_id (str) – station/device ID

  • file_type (Literal['L1', 'L2A_beta', 'L2A_stratfinder', 'L2B_stratfinder']) – one of L1, L2A_beta, L2A_stratfinder or L2B_stratfinder

  • start_date (str | date | datetime | None) – optional inclusive lower bound (default: None)

  • end_date (str | date | datetime | None) – optional inclusive upper bound (default: None)

  • kwargs (Any) – additional keyword arguments passed to xarray.open_mfdataset

Return type:

Generator[Dataset]

Returns:

xarray dataset containing the selected time range

Raises:

put_file(device_id, file_type, file_date, override=False) -> str

Prepare one archive file path and create missing directories.

This method only resolves and prepares the path for downstream tools. It does not write file contents.

Parameters:
  • device_id (str) – station/device ID

  • file_type (Literal['L1', 'L2A_beta', 'L2A_stratfinder', 'L2B_stratfinder']) – one of L1, L2A_beta, L2A_stratfinder or L2B_stratfinder

  • file_date (str | date | datetime) – target file date

  • override (bool) – allow existing file path when True (default: False)

Return type:

str

Returns:

prepared full path

Raises: