Welcome to the ceilometer-toolbox documentation!
ceilometer-toolbox
This is a unified collection of state-of-the-art tools for processing ceilometer data. It makes use of the following tools:
stratfinder (Kotthaus et al. 2020)
It builds a file tree to easily store and access data from multiple sensors, hiding multi-file and multi-folder complexity and making the data easily accessible from Python.
Docs: https://rubclim.github.io/ceilometer-toolbox/
Installation
via https
pip install git+https://github.com/RUBclim/ceilometer-toolbox
via ssh
pip install git+ssh://git@github.com/RUBclim/ceilometer-toolbox
Getting started
Locate the root folder in which all raw ceilometer data is stored. It is important that the file names match this format: {prefix}{file_date:%Y%m%d}_*.nc. If a custom way of finding the raw data between two dates is needed, the Ceilometer.glob_day_raw_data method needs to be overridden after inheriting from Ceilometer.
├── ceilometer-data
│   ├── live_20260217_150920.nc
│   ├── live_20260217_151420.nc
│   ├── live_20260217_151920.nc
...
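As a rough illustration of the naming pattern above, the following sketch shows how the raw files of a single day could be collected with a glob. The helper name `glob_day` and its arguments are assumptions made for this example. It is not the toolbox's `Ceilometer.glob_day_raw_data` implementation, whose signature is not documented here.

```python
from datetime import date
from pathlib import Path


def glob_day(input_dir: Path, prefix: str, file_date: date) -> list[Path]:
    """Return all raw files of one day, sorted by name (hypothetical helper)."""
    # Expands to e.g. 'live_20260217_*.nc' for prefix='live_'
    pattern = f'{prefix}{file_date:%Y%m%d}_*.nc'
    return sorted(input_dir.glob(pattern))
```

Files whose names follow a different scheme would not be matched by such a pattern, which is the situation in which a custom lookup becomes necessary.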
Create a CeilometerArchive instance and point it at a folder where you want to store the data.
from ceilometer_toolbox import CeilometerArchive

archive = CeilometerArchive('ceilometer-output')
Create a Ceilometer instance and pass the previously created archive to it.
from ceilometer_toolbox import Ceilometer

ceilometer = Ceilometer(
    device_id='IA',
    input_dir='ceilometer-input',
    archive=archive,
    raw2l1_config_file='example_configs/raw2l1_cl61.conf',
    stratfinder_config_file='example_configs/stratfinder_settings_cl61.json',
    stratfinder_qc_value_config_file='example_configs/values_qc.toml',
    stratfinder_qc_metadata_file='example_configs/STRATFINDER_metadata.toml',
)
You may provide the config files for all tools when creating the instance. They are then used in the respective processing steps, but may also be overridden. Please see the respective tool for full documentation of its configuration.
Now start processing the raw data to L1:
ceilometer.process_raw_files(start_date='2026-05-06', end_date='2026-05-07', jobs=1)
This will run raw2l1, reading from the input_dir specified. jobs controls concurrency, spawning multiple processes that run raw2l1 in parallel. Note that this is an IO-heavy task, so excessively high concurrency may lead to slower performance, especially when the target or source is a mounted network drive.
Now run stratfinder on the L1 data. This cannot be run in parallel, since it depends on files from the previous day, which may not be ready. By default this will run stratfinder in docker via the MATLAB runtime. If you already have stratfinder set up locally, you may pass in_docker=False and set executable_path=...; docker is then not needed.
ceilometer.process_l1_files(start_date='2026-05-06')
Unless you run stratfinder locally, this step requires docker to be installed and the stratfinder image to be built. Please see STRATfinder-docker for instructions.
Finally, run the quality control on the stratfinder output:
ceilometer.process_stratfinder_qc(start_date='2026-05-06')
Now a file tree should be present (device_id → year → month → day/file type):
├── ceilometer-output
│   └── IA
│       └── 2026
│           └── 05
│               ├── 20260503_L1.nc
│               ├── 20260503_L2A_beta.nc
│               ├── 20260503_L2A_stratfinder.nc
│               └── 20260503_L2B_stratfinder.nc
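To make the layout of the tree concrete, here is a minimal sketch of how the location of a single file could be derived from its device, type and date. The function `archive_path` is hypothetical and only mirrors the directory listing above. The toolbox may build its paths differently internally.

```python
from datetime import date
from pathlib import Path


def archive_path(root: Path, device_id: str, file_type: str, file_date: date) -> Path:
    """Build the expected location of one file in the tree (hypothetical helper)."""
    return (
        root
        / device_id
        / f'{file_date:%Y}'  # year folder, e.g. '2026'
        / f'{file_date:%m}'  # zero-padded month folder, e.g. '05'
        / f'{file_date:%Y%m%d}_{file_type}.nc'
    )


print(archive_path(Path('ceilometer-output'), 'IA', 'L1', date(2026, 5, 3)).as_posix())
# prints: ceilometer-output/IA/2026/05/20260503_L1.nc
```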
Accessing data
The data is stored in a tree-like structure so filesystem performance remains high and
access to ranges of data is fast. The CeilometerArchive instance allows interaction
with the file tree, fully hiding its complexity.
Reading
Any range of data can be accessed with a context manager like this:
from datetime import datetime

with archive.open_dataset(
    device_id='IA',
    file_type='L2A_stratfinder',
    start_date=datetime(2026, 5, 1),
    end_date=datetime(2026, 5, 3),
) as ds:
    ...
This will find and read all files needed to cover the range. Because dask is used, the
files are not all read into memory at once, so long time periods can be loaded
without the need for a lot of RAM.
The archive may be queried for the latest date of a file type, e.g. to determine where to continue processing.
archive.latest_date(
    device_id='IA',
    file_type='L1',
)
To speed up the search, you can set from_date as the point in time from which to start
looking backwards for the latest file of the specified type. This may be needed if
historical data should be processed.
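The backwards search described above can be pictured with the following sketch. It assumes the device/year/month tree shown earlier. The function name `find_latest_date` and the `max_days` cut-off are assumptions for this example and are not taken from the toolbox, whose `latest_date` method may work differently.

```python
from datetime import date, timedelta
from pathlib import Path
from typing import Optional


def find_latest_date(
    root: Path,
    device_id: str,
    file_type: str,
    from_date: date,
    max_days: int = 366,  # hypothetical safety limit for the search
) -> Optional[date]:
    """Walk backwards from from_date and return the first day that has a file."""
    for offset in range(max_days):
        day = from_date - timedelta(days=offset)
        candidate = (
            root / device_id / f'{day:%Y}' / f'{day:%m}' / f'{day:%Y%m%d}_{file_type}.nc'
        )
        if candidate.is_file():
            return day
    # Nothing found within the search window
    return None
```

Starting close to the period of interest keeps the number of checked days small, which is why a start point helps when old data is reprocessed.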
If you want to handle the files manually, you can use the get_files method to retrieve
the raw list of files that cover the specified date range.
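Conceptually, collecting the files for a date range amounts to visiting each day and keeping the files that exist. The sketch below illustrates this under the assumption of one file per day and type in the tree shown earlier, with both range ends treated as inclusive. `files_in_range` is a hypothetical stand-in and not the signature of `get_files`.

```python
from datetime import date, timedelta
from pathlib import Path


def files_in_range(
    root: Path,
    device_id: str,
    file_type: str,
    start_date: date,
    end_date: date,
) -> list[Path]:
    """Return the existing daily files from start_date to end_date (inclusive)."""
    files = []
    day = start_date
    while day <= end_date:
        path = root / device_id / f'{day:%Y}' / f'{day:%m}' / f'{day:%Y%m%d}_{file_type}.nc'
        # Days without a file are skipped instead of raising an error
        if path.is_file():
            files.append(path)
        day += timedelta(days=1)
    return files
```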
To check if an individual file exists you may use the get_file_or_none method.
archive.get_file_or_none(device_id='IA', file_type='L1', file_date=datetime(2026, 5, 1))
This will return the full path to the matching file or None when the file does not
exist.
Writing
Adding files to the file tree can be done via put_file or atomic_put_file.
archive.put_file(
    device_id='IA',
    file_type='L1',
    file_date=datetime(2026, 5, 1),
    override=True
)
Since raw2l1 writes values consecutively to the file, this is not atomic. Trying to run any other call on a partially written file will fail.
To ensure atomic writing, use the atomic_put_file context manager, which first
writes to a temporary file and, once finished, atomically replaces or creates the final
file. The context manager yields the path of the temporary file, which may be passed to
the tool as its output file.
with archive.atomic_put_file(
    device_id='IA',
    file_type='L1',
    file_date=datetime(2026, 5, 1),
    override=True
) as f:
    ...
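The general technique behind such an atomic write can be sketched with the standard library alone. The context manager `atomic_write` below is a hypothetical illustration of the write-to-temporary-then-replace pattern. It is not the implementation of `atomic_put_file`.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def atomic_write(final_path: Path) -> Iterator[Path]:
    """Yield a temporary path and move it to final_path on success (sketch)."""
    final_path.parent.mkdir(parents=True, exist_ok=True)
    # Create the temporary file in the target directory so that the final
    # rename stays on one filesystem, which os.replace needs to be atomic.
    fd, tmp_name = tempfile.mkstemp(dir=final_path.parent, suffix='.tmp')
    os.close(fd)
    tmp_path = Path(tmp_name)
    try:
        yield tmp_path
        # Readers see either the old file or the complete new one
        os.replace(tmp_path, final_path)
    except BaseException:
        # Remove the partial file and leave any existing final file untouched
        tmp_path.unlink(missing_ok=True)
        raise
```

If the body of the `with` block raises, the partially written temporary file is discarded and a previously existing final file stays as it was.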
Files may also be deleted using the delete method.
Plotting data
The toolbox also comes with simple functions for plotting $\beta$ and the linear depolarization ratio (CL61).
ceilometer.beta_plot(
    start_date=datetime(2026, 4, 28),
    end_date=datetime(2026, 5, 2),
    show_mlh=True,
    show_ablh=True,
    show_cbh=True,
    alt_max=2500,
    output_path='beta_plot.png',
)

This automatically applies resampling (nearest) to allow plotting longer time series.
This can, however, be changed by passing a different function via resampler=, e.g.
to use averages instead, which are computationally much more expensive. The QC flags are
automatically taken into account and flagged data is excluded, unless you set filter_qc=False.
The maximum altitude can be set via alt_max. The linear depolarization plot has a
similar interface but omits the MLH, ABLH and CBH options.
ceilometer.ldr_plot(
    start_date=datetime(2026, 4, 28),
    end_date=datetime(2026, 5, 2),
    alt_max=2500,
    output_path='ldr_plot.png',
)
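The cost difference between nearest-value resampling and averaging, mentioned for the plotting functions above, can be illustrated with a small pure-Python sketch. Both functions are hypothetical and operate on plain lists. They do not show the toolbox's resampling code or the callable expected by resampler=.

```python
from bisect import bisect_left
from statistics import fmean


def resample_nearest(times: list[float], values: list[float], targets: list[float]) -> list[float]:
    """For every target time pick the value whose timestamp is closest."""
    out = []
    for t in targets:
        i = bisect_left(times, t)
        # Only the neighbours left and right of the insertion point can be nearest
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        best = min(candidates, key=lambda j: abs(times[j] - t))
        out.append(values[best])
    return out


def resample_mean(times: list[float], values: list[float], targets: list[float], width: float) -> list[float]:
    """Average all values that fall into [target, target + width)."""
    out = []
    for t in targets:
        in_bin = [v for tt, v in zip(times, values) if t <= tt < t + width]
        out.append(fmean(in_bin) if in_bin else float('nan'))
    return out
```

Nearest-value resampling needs one lookup per output point, while averaging has to touch every input sample, which is consistent with averaging being the more expensive choice for long time series.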

References
Kotthaus, S., Haeffelin, M., Drouin, M.-A., Dupont, J.-C., Grimmond, S., Haefele, A., Hervo, M., Poltera, Y., & Wiegner, M. (2020). Tailored Algorithms for the Detection of the Atmospheric Boundary Layer Height from Common Automatic Lidars and Ceilometers (ALC). Remote Sensing, 12(19), 3259. https://doi.org/10.3390/rs12193259