# Development

## Managing Requirements

### Adding a New Requirement
Add production requirements to `requirements.in` and development requirements to `requirements-dev.in`. Then compile the production requirements:

```shell
uv pip compile --no-annotate requirements.in -o requirements.txt
```

…and the development requirements:

```shell
uv pip compile --no-annotate requirements-dev.in -o requirements-dev.txt
```

This adds the new requirements but does not upgrade all others.
### Upgrading Existing Requirements
To upgrade the existing production requirements, run:

```shell
uv pip compile --upgrade --no-annotate requirements.in -o requirements.txt
```

…and for the development requirements:

```shell
uv pip compile --upgrade --no-annotate requirements-dev.in -o requirements-dev.txt
```

This is also done automatically once a week by a GitHub Actions workflow, which creates a PR.
## Database

### Views
The database uses its own manually managed views so that incremental refreshes are supported. Only the most recent data is refreshed every five minutes; to make the system self-healing, all views are additionally fully refreshed once a day.

A code-generating tool was developed to generate the hourly and daily views from the raw data. A pre-commit hook ensures everything stays in sync (`/bin/generate_view.py`).
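As a rough illustration of the idea behind such a generator (the table and column names below are invented for this sketch and are not the project's actual schema), a tool like this typically builds the aggregate-view SQL from the name of the raw table:

```python
# Hypothetical sketch of a view generator in the spirit of
# bin/generate_view.py. Table and column names are invented.
def daily_view_sql(raw_table: str) -> str:
    """Emit the SQL for a daily aggregate view over `raw_table`."""
    view = f"{raw_table}_daily"
    return (
        f"CREATE VIEW {view} AS\n"
        f"SELECT date_trunc('day', measured_at) AS day,\n"
        f"       station_id,\n"
        f"       avg(value) AS value_avg\n"
        f"FROM {raw_table}\n"
        f"GROUP BY 1, 2;"
    )

print(daily_view_sql("measurements"))
```

Generating the statements from code (instead of hand-writing each view) is what allows the pre-commit hook to verify that the checked-in SQL matches the raw tables.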
### Migrations
This system uses Alembic for database migrations. Make sure you generate/implement a migration for every change made to the database.

You can create a new migration by running:

```shell
alembic revision --autogenerate -m "<message_what_changed>"
```

If the database was just created using the latest schema, you have to stamp it by running:

```shell
alembic stamp head
```
Warning

Running migrations can result in the loss of data. For example, when an upgrade removes a column, it also removes the data in that column. A subsequent downgrade will only restore the column, not the data it previously held; that data has to be restored from a backup.
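This asymmetry can be demonstrated with the standard-library `sqlite3` module (a stand-in for the production database, used here only for illustration; `ALTER TABLE … DROP COLUMN` needs SQLite ≥ 3.35):

```python
import sqlite3

# In-memory database with one row of data in the column we will drop.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, note TEXT)")
con.execute("INSERT INTO readings (note) VALUES ('hot day')")

# "Upgrade": dropping the column discards its data.
con.execute("ALTER TABLE readings DROP COLUMN note")
# "Downgrade": re-adding the column restores the schema, not the values.
con.execute("ALTER TABLE readings ADD COLUMN note TEXT")

row = con.execute("SELECT note FROM readings").fetchone()
print(row)  # -> (None,)
```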
You can upgrade the database to the latest schema by running:

```shell
alembic upgrade head
```

You can downgrade the database to the previous schema by running:

```shell
alembic downgrade -1
```
## Deployment
The deployment is implemented in Ansible. The repository is private. The full deployment workflow will be adapted upon transfer of the system to a new home.

It currently requires a (virtual) machine with the following specs:
- 8x CPU
- 16 GB RAM
- 32 GB HDD for the OS
- 3 TB HDD for the data (the raster data is quite big; if only the measurement-data API is needed, this can be much less: one year of data roughly equals 5 GB in the database, and static data is around 1 GB)
- Debian-based OS (currently Ubuntu 24.04, noble)
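A quick sizing sketch based on the figures above (roughly 5 GB per year of measurement data plus about 1 GB of static data; the function name here is ours, not part of the codebase):

```python
def estimate_measurement_db_gb(years: float) -> float:
    """Rough database size in GB: ~5 GB per year of data plus ~1 GB static."""
    per_year_gb = 5.0
    static_gb = 1.0
    return years * per_year_gb + static_gb

print(estimate_measurement_db_gb(3))  # -> 16.0 GB for three years of data
```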
Run the playbook:

```shell
ansible-playbook d2r.yml
```

- You will be prompted for the become (sudo) password.
- You will be prompted for the vault password; the vault contains all secrets needed for the deployment.
In addition, the deployment requires a `.env.prod` file and the SSL certificates.
### Environment Variables
An example of all necessary environment variables can be found in .env.dev and may be
adapted for production use.
| Variable | Description |
|---|---|
|  | Absolute path to the data directory on the host machine. This is used to store the model output rasters and the temperature/relative humidity interpolation rasters. For example … |
|  | Absolute path to the directory on the host machine where all static images for the … |
|  | Provider string of the database used. Since this uses TimescaleDB, only Postgres-like engines are supported. For example … |
|  | Host name of the database (container); usually corresponds to the name of the database container. In this case … |
|  | Port the database is listening on. For Postgres this is usually … |
|  | Database user used to connect to the database. For example … |
|  | Password of the above user. |
|  | Name of the database. If it does not exist, it is created under this name. For example … |
|  | Sentry data source name (DSN); see the Sentry docs. Optional: if not set, no errors are reported. |
|  | Sentry traces sample rate (the fraction of transactions sent to Sentry, between 0 and 1); see `traces_sample_rate`. |
|  | URL of the Celery broker/task queue that distributes tasks. This is usually … |
|  | Time limit of a single task in seconds. If this is exceeded, the task is killed. Usually … |
|  | API key for the Element IoT platform operated by DOData. It needs read scope on the … |
|  | Database provider for the Terracotta instance. In this case also … |
|  | Name of the database for Terracotta. If it does not exist, it is created under this name. For example … |
|  | Host name of the database (container); usually corresponds to the name of the database container. In this case … |
|  | Sentry traces sample rate for the Terracotta service (the fraction of transactions sent to Sentry, between 0 and 1); see `traces_sample_rate`. The same … |
|  | The resampling method used when reading reprojected raster data. |
|  | The resampling method used when reading raster data reprojected to Web Mercator. |
|  | Directory on the host machine where the SSL certificates are stored. |
|  | The number of days (retention period) to keep model rasters on the machine and in the database. For example … |
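The database-related variables in the table combine into a single connection URL. A minimal sketch of that assembly (the function name and the example values are assumptions for illustration, not the actual names from `.env.dev`):

```python
# Assemble a database URL from the parts described above:
# provider, user, password, host, port, and database name.
# All values here are illustrative only.
def build_db_url(provider: str, user: str, password: str,
                 host: str, port: int, name: str) -> str:
    return f"{provider}://{user}:{password}@{host}:{port}/{name}"

print(build_db_url("postgresql", "app", "secret", "database", 5432, "app"))
# -> postgresql://app:secret@database:5432/app
```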