Welcome to isithot's documentation!

Installation

via https

pip install git+https://github.com/RUBclim/isithot

via ssh

pip install git+ssh://git@github.com/RUBclim/isithot

Quick start

An initial app can be created quite simply by adding a single data provider.

Adding data providers

  1. Subclass isithot.DataProvider. The DataProvider.get_current_data() and DataProvider.get_daily_data() methods need to be implemented.

  2. Create an isithot.ColumnMapping instance that maps the columns of your data source to the columns the package expects.

  3. Create a dictionary of isithot.DataProvider instances where each key matches the id of its provider.

  4. Register the data providers with the current app: app.config['DATA_PROVIDERS'] = data_providers.

from datetime import date

import pandas as pd
from flask import Flask
from isithot import ColumnMapping
from isithot import create_app
from isithot import DataProvider
from isithot.config import Config


class TestProvider(DataProvider):
    def get_current_data(self, d: date) -> pd.DataFrame:
        df = pd.DataFrame({
            'date': [pd.Timestamp(d)],
            'temp_max': [30.0],
            'temp_min': [20.0],
        })
        return df.set_index('date')

    def get_daily_data(self, d: date) -> pd.DataFrame:
        x = pd.read_csv(
            'testing/monthly_input_data/lmss_daily_long.csv',
            parse_dates=['date'],
            index_col='date',
        )
        x['doy'] = x.index.dayofyear
        return x


def my_app() -> Flask:
    col_map = ColumnMapping(
        datetime='date',
        temp_mean='temp_mean_mannheim',
        temp_max='temp_max',
        temp_min='temp_min',
        day_of_year='doy',
    )

    data_providers = {
        'test': TestProvider(
            col_mapping=col_map,
            name='Test',
            id='test',
            min_year=2010,
        ),
    }

    app = create_app(Config)
    app.config['DATA_PROVIDERS'] = data_providers
    return app


if __name__ == '__main__':
    app = my_app()
    app.run(debug=True)

Implementing caching

The isithot app comes with a cache that can be added to a function as a decorator. For example, the daily data will likely not change very often, so it can be cached for, e.g., one hour.

from isithot.cache import cache

class TestProvider(DataProvider):
    @cache.cached(timeout=60*60, key_prefix='daily_data')
    def get_daily_data(self, d: date) -> pd.DataFrame:
        ...
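The idea of a time-limited cache can be illustrated without Flask. The following is a minimal stdlib-only sketch, not the implementation behind isithot.cache; ttl_cache and load_daily_data are hypothetical names, and the date argument d is assumed to serve as the cache key.

```python
import time
from datetime import date
from functools import wraps
from typing import Callable, TypeVar

T = TypeVar('T')


def ttl_cache(timeout: float) -> Callable[[Callable[[date], T]], Callable[[date], T]]:
    """Cache results per date argument for ``timeout`` seconds (hypothetical helper)."""
    def decorator(func):
        # maps the date argument to (time of insertion, cached value)
        store = {}

        @wraps(func)
        def wrapper(d: date):
            now = time.monotonic()
            hit = store.get(d)
            if hit is not None and now - hit[0] < timeout:
                return hit[1]  # still fresh, skip the expensive call
            value = func(d)
            store[d] = (now, value)
            return value
        return wrapper
    return decorator


calls = []  # records every real (uncached) invocation


@ttl_cache(timeout=60 * 60)
def load_daily_data(d: date) -> str:
    calls.append(d)
    return f'daily data up to {d.isoformat()}'


first = load_daily_data(date(2024, 7, 1))
second = load_daily_data(date(2024, 7, 1))  # served from the cache
```

Within the timeout, the second call does not reach the underlying function, which is the behaviour one would want for data that changes rarely.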

More complex data retrieval

A more complex example, which uses database queries, can be found in testing/example_app.py. All implementations need to consider performance, since the data retrieval is executed while the HTTP request is handled.

Another option for data retrieval is for the server to perform an API request, e.g.:

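import urllib.request
from datetime import date
from datetime import datetime

import pandas as pd
from flask import current_app
from isithot import DataProvider


class DWDProvider(DataProvider):  # hypothetical class name, for illustration
    def get_current_data(self, d: date) -> pd.DataFrame:
        """
        fetch the latest weather data from the DWD. ``self.id`` corresponds to the
        station ID by DWD which is set during DataProvider creation.
        """
        ret = urllib.request.urlopen(
            f'https://dwd.api.proxy.bund.dev/v30/stationOverviewExtended?stationIds={self.id}',
            timeout=3,
        )
        data = current_app.json.loads(ret.read())
        # the temperatures are divided by 10 to convert them to degrees
        temp_min = data[self.id]['days'][0]['temperatureMin'] / 10
        temp_max = data[self.id]['days'][0]['temperatureMax'] / 10
        # named ``day`` so the imported ``date`` type is not shadowed
        day = datetime.strptime(
            data[self.id]['days'][0]['dayDate'], '%Y-%m-%d',
        )
        return pd.DataFrame(
            {
                self.col_mapping.temp_min: temp_min,
                self.col_mapping.temp_max: temp_max,
            },
            index=pd.DatetimeIndex([day], name=self.col_mapping.datetime),
        )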
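
The parsing step of such a method can be tried offline. The following sketch uses a hand-written payload whose values are made up and which contains only the fields accessed above; parse_station_overview and the station ID X0000 are hypothetical and not part of the package.

```python
import json
from datetime import datetime

# made-up sample payload, reduced to the fields that are actually read
SAMPLE = json.dumps({
    'X0000': {
        'days': [
            {
                'dayDate': '2024-07-01',
                'temperatureMin': 152,
                'temperatureMax': 287,
            },
        ],
    },
})


def parse_station_overview(raw: str, station_id: str) -> dict:
    """Extract date, minimum and maximum temperature of the first day."""
    data = json.loads(raw)
    first_day = data[station_id]['days'][0]
    return {
        'date': datetime.strptime(first_day['dayDate'], '%Y-%m-%d'),
        # same division by 10 as in the method above
        'temp_min': first_day['temperatureMin'] / 10,
        'temp_max': first_day['temperatureMax'] / 10,
    }


parsed = parse_station_overview(SAMPLE, 'X0000')
```

Keeping the parsing separate from the network call makes it easier to test and to wrap the request itself in a cache.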

API-Documentation

i18n

This web app uses internationalization (i18n) so that the page is also available in German, since the audience will mostly be German. This is set up via Babel, and all English text (in both .py and .html files) is wrapped in a _(...) function. These strings can be extracted automatically via:

pybabel extract -F babel.cfg -o isithot/translations/messages.pot .

This generates a messages.pot file, which is the basis for all translations. Based on this file, a translation can be initialized with the following command, in this case for German (de):

pybabel init -i isithot/translations/messages.pot -d isithot/translations/ -l de

This creates a subfolder for the specific language (in this case de for German). The messages.po file in that subfolder can now be used to translate all messages.

Finally, the languages have to be compiled into a messages.mo file. This needs to be done manually for testing. It is done automatically for production while building the docker image.

pybabel compile -d isithot/translations
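
To check by hand that a compiled catalog is picked up, the standard library's gettext module can read .mo files as well. This is only a sketch for manual testing, assuming the default domain name messages and the directory used in the commands above; the app itself goes through Babel, and the message 'Is it hot?' is a hypothetical example string. With fallback=True, a missing catalog simply yields the untranslated string.

```python
import gettext

# look for isithot/translations/de/LC_MESSAGES/messages.mo
translations = gettext.translation(
    'messages',
    localedir='isithot/translations',
    languages=['de'],
    fallback=True,  # no error if the catalog has not been compiled yet
)
_ = translations.gettext

# hypothetical message; prints the German text once it has been
# translated and compiled, otherwise the English original
text = _('Is it hot?')
print(text)
```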

Important

If any of the strings wrapped in a _(...) function (in the .py or .html files) are changed, the .pot file and the translations need to be updated using these commands:

pybabel extract -F babel.cfg -o isithot/translations/messages.pot .
pybabel update -i isithot/translations/messages.pot -d isithot/translations

app

isithot.app.create_app(config)

create and configure the isithot Flask application.

Parameters:

config (object) – Configuration object to use for the Flask app.

Return type:

Flask

Returns:

Configured Flask application instance.

blueprints

isithot.blueprints.isithot.get_locale()

Utility for getting the language from the Accept-Language header.

Return type:

str | None

Returns:

the language key - either de or en
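
How such a lookup might work can be sketched with the standard library alone. This is not the package's implementation, which is not shown here; best_language is a hypothetical helper that picks the supported language with the highest quality value from an Accept-Language header and returns None if neither de nor en is requested.

```python
from typing import Optional, Sequence


def best_language(
        header: str,
        supported: Sequence[str] = ('de', 'en'),
) -> Optional[str]:
    """Return the best supported language key or None (hypothetical helper)."""
    candidates = []
    for position, part in enumerate(header.split(',')):
        pieces = part.strip().split(';')
        tag = pieces[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # the default when no q parameter is given
        for param in pieces[1:]:
            name, _, value = param.strip().partition('=')
            if name.strip() == 'q':
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        # a regional tag such as 'de-DE' should match the key 'de'
        primary = tag.split('-')[0]
        if primary in supported and quality > 0:
            # sort by highest quality first, then by position in the header
            candidates.append((-quality, position, primary))
    if not candidates:
        return None
    return min(candidates)[2]


lang = best_language('de-DE,de;q=0.9,en;q=0.8')
```

In a Flask app, the parsed header is usually available on the request object, so a hand-written parser like this is mainly useful for understanding the selection logic.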

isithot.blueprints.isithot.index()

A simple route to have a nicer link to share.

Return type:

Response

isithot.blueprints.isithot.last_years_calendar(station, year)

Returns the calendar figure data as json for the specified year.

This route is cached indefinitely and does not take the locale into account, since it’s only static data.

Parameters:
  • station (str) – The station a plot is created for.

  • year (int) – The year a plot is created for.

Return type:

str

isithot.blueprints.isithot.plots(station)

Renders the isithot page with all plots.

This route is cached since compiling the data and generating the plots is quite expensive. The cache expires after 5 minutes, so the page still shows almost live data.

Parameters:

station (str) – The station a plot is created for.

Return type:

str

class isithot.blueprints.plots.ColumnMapping(datetime: str, temp_mean: str, temp_max: str, temp_min: str, day_of_year: str)

Class defining the mapping of column names to the different parameters needed.

Parameters:
  • datetime – the column name of the column that stores the date (and maybe time) information

  • temp_mean – the column name of the column that stores the average air-temperature information

  • temp_max – the column name of the column that stores the maximum air-temperature information

  • temp_min – the column name of the column that stores the minimum air-temperature information

  • day_of_year – the column name of the column that stores the day of year number

datetime: str

Alias for field number 0

day_of_year: str

Alias for field number 4

temp_max: str

Alias for field number 2

temp_mean: str

Alias for field number 1

temp_min: str

Alias for field number 3
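
The "Alias for field number" entries suggest that ColumnMapping behaves like a named tuple, so each field can presumably be read by name or by position. The following sketch uses a stand-in class, ColumnMappingSketch, rather than the real one, and a made-up data row, to show how the mapping decouples the column names of a data source from the names the package works with.

```python
from typing import NamedTuple


class ColumnMappingSketch(NamedTuple):
    """Stand-in with the same fields as documented above (an assumption)."""
    datetime: str
    temp_mean: str
    temp_max: str
    temp_min: str
    day_of_year: str


col_map = ColumnMappingSketch(
    datetime='date',
    temp_mean='temp_mean_mannheim',
    temp_max='temp_max',
    temp_min='temp_min',
    day_of_year='doy',
)

# made-up row that uses the column names of the data source
row = {'date': '2024-07-01', 'temp_mean_mannheim': 21.4, 'doy': 183}

# the value is looked up via the mapping, not via a hard-coded column name
mean_today = row[col_map.temp_mean]
```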

class isithot.blueprints.plots.DataProvider(col_mapping, name, id, min_year)

Base Class for defining a custom data provider. get_daily_data() and get_current_data() need to be overridden.

Parameters:
  • col_mapping (ColumnMapping) – a ColumnMapping() mapping the column names returned by get_daily_data() or get_current_data() to variables so they can be used later

  • name (str) – the name of the station that is displayed on the website

  • id (str) – the ID of the station that is used for compiling links. If multiple DataProviders are used, each one must have a unique id.

  • min_year (int) – the minimum year for which data is available. This is used to determine the first year for which a calendar plot is created.
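
Since the dictionary key of each registered provider must match its id, a small check before registering the providers can catch mismatches early. The following is a sketch only: ProviderSketch is a stand-in for a real DataProvider, and check_providers is a hypothetical helper that is not part of the package.

```python
from dataclasses import dataclass


@dataclass
class ProviderSketch:
    """Stand-in carrying only the attributes needed for the check."""
    name: str
    id: str
    min_year: int


def check_providers(providers: dict) -> None:
    """Raise a ValueError if a dictionary key differs from the provider id."""
    for key, provider in providers.items():
        if key != provider.id:
            raise ValueError(
                f'key {key!r} does not match provider id {provider.id!r}',
            )


providers = {
    'test': ProviderSketch(name='Test', id='test', min_year=2010),
}
check_providers(providers)  # passes silently, key and id agree
```

Because the providers are stored in a dictionary keyed by id, the ids are unique as long as every key matches.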

calendar_fig(calendar_data)

Creates a figure representing a calendar plot of the current year, indicating the percentile of each day as a color and a number.

Parameters:

calendar_data (DataFrame) – a pd.DataFrame() containing all data necessary for creating the plot

Return type:

Figure

Returns:

a Figure() object that can be used as a json on the page, defining the plot including all data

distrib_fig(fig_data)

Creates a figure representing the distribution with the 5% and 95% percentiles and the trends for the time of year and the overall warming trend.

Parameters:

fig_data (PlotData) – a PlotData() object containing all data necessary for creating the plot

Return type:

Figure

Returns:

a Figure() object that can be used as a json on the page, defining the plot including all data

get_current_data(d)

This needs to be implemented and will most likely be a database query or a file that is read. It might make sense to cache this function. d may be used as a cache key.

This should return a pd.DataFrame() with columns containing:

  • date (as a datetime object)

  • maximum temperature

  • minimum temperature

The index must be a pd.DatetimeIndex(). The column names must match those defined via col_mapping.

Parameters:

d (date) – the date for which to prepare data. This will usually be today

Return type:

DataFrame

get_daily_data(d)

This needs to be implemented and will most likely be a database query or a file that is read. It might make sense to cache this function. d may be used as a cache key.

This should return a pd.DataFrame() with columns containing:

  • date (as a datetime object)

  • mean temperature

  • the day of the year

The index must be a pd.DatetimeIndex(). The column names must match those defined via col_mapping.

Parameters:

d (date) – the date for which to prepare data. This will usually be today

Return type:

DataFrame

hist_fig(fig_data)

Creates a figure representing a histogram or, more specifically, a kernel density estimate. This includes lines for the 5% percentile and 95% percentile as well as the median. A red line for today’s value is added.

Parameters:

fig_data (PlotData) – a PlotData() object containing all data necessary for creating the plot

Return type:

Figure

Returns:

a Figure() object that can be used as a json on the page, defining the plot including all data

prepare_daily_and_calendar_data(d, current_avg=None)

This gets the daily data from the database and creates the calendar plot data. This is separated from prepare_data() so it can be used via last_years_calendar().

Parameters:
  • d (date) – the date for which to prepare data. This will usually be today or in this case the first day of the year to prepare the calendar data for

  • current_avg (float | None) – This is used to add the current day which has no entry in the daily data just yet. When working with previous years, this should be left as None (default: None)

Return type:

tuple[DataFrame, DataFrame]

Returns:

a tuple of pd.DataFrame(): (daily, calendar_data)

prepare_data(d)

The purpose of this function is to compile an isithot.blueprints.plots.PlotData() object which is used for the creation of all plots.

Parameters:

d (date) – the date for which to prepare data. This will usually be today

Return type:

PlotData

Returns:

the data needed for creating the plots and texts, all contained in an isithot.blueprints.plots.PlotData() object

class isithot.blueprints.plots.PlotData(current_date: date, daily: pd.DataFrame, now: pd.DataFrame, toy_data: pd.DataFrame, trend_overall_data: pd.DataFrame, trend_month_data: pd.DataFrame, calendar_data: pd.DataFrame, trend_overall_slope: float, trend_overall_intercept: float, trend_month_slope: float, trend_month_intercept: float, current_avg: float, current_avg_percentile: float, q5: float, median: float, q95: float)

Parameters:
  • current_date – The date for which the data is compiled. This is usually today

  • daily – A pandas dataframe containing all daily data that is available in the database

  • now – The latest data from the station (high resolution raw data)

  • toy_data – Data for the current time of year (toy). For this, a week before current_date and a week after current_date is extracted

  • trend_overall_data – (Yearly) data needed to calculate the overall trend since the start of the measurements

  • trend_month_data – Data needed for calculating the trend for the current month

  • calendar_data – Data needed to create a calendar plot for the current year

  • trend_overall_slope – The slope of the line for the overall warming trend across all years and times of year

  • trend_overall_intercept – The intercept of the line for the overall warming trend across all years and times of year

  • trend_month_slope – The slope of the line for the current warming trend across all years for the current time of year ± 7 days

  • trend_month_intercept – The intercept of the line for the current warming trend across all years for the current time of year ± 7 days

  • current_avg – The current average of today calculated from averaging the minimum and maximum temperature

  • current_avg_percentile – The percentile of current_avg

  • q5 – the 5% percentile for this time of the year

  • median – the median/50% percentile for this time of the year

  • q95 – the 95% percentile for this time of the year
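
How a few of these fields relate to each other can be sketched with the standard library. This is not the package's code: the helper in_toy_window, the sample values, the percentile definition (share of values at or below the current average) and the cut-point method of statistics.quantiles are all assumptions made for illustration.

```python
from datetime import date
from statistics import quantiles


def in_toy_window(day: date, current: date, days: int = 7) -> bool:
    """True if ``day`` is within ``days`` of the time of year of ``current``."""
    # neighbouring years are checked so the window works around New Year
    for year_shift in (-1, 0, 1):
        try:
            anchor = current.replace(year=day.year + year_shift)
        except ValueError:  # 29 February in a non-leap year
            anchor = date(day.year + year_shift, 2, 28)
        if abs((day - anchor).days) <= days:
            return True
    return False


# made-up daily mean temperatures
history = {
    date(2022, 6, 28): 18.0,
    date(2022, 7, 3): 22.0,
    date(2023, 7, 8): 24.0,
    date(2023, 7, 20): 30.0,  # outside the window
    date(2023, 1, 1): 2.0,    # outside the window
}
current_date = date(2024, 7, 1)

# toy_data: all values within a week before and after the time of year
toy_values = sorted(
    value for day, value in history.items()
    if in_toy_window(day, current_date)
)

# current_avg: mean of today's minimum and maximum temperature
temp_min, temp_max = 18.0, 28.0
current_avg = (temp_min + temp_max) / 2

# current_avg_percentile: share of historical values <= current_avg
current_avg_percentile = (
    100 * sum(value <= current_avg for value in toy_values) / len(toy_values)
)

# 19 cut points in steps of 5%: first, tenth and last are q5, median, q95
cuts = quantiles(toy_values, n=20)
q5, median, q95 = cuts[0], cuts[9], cuts[18]
```

With real data the window contains far more values, and a different percentile or interpolation method may give slightly different numbers.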

property avg_compare: str

returns a more detailed sentence than a plain yes/no

calendar_data: DataFrame

Alias for field number 6

current_avg: float

Alias for field number 11

current_avg_percentile: float

Alias for field number 12

current_date: date

Alias for field number 0

daily: DataFrame

Alias for field number 1

property hot_warm: str

median: float

Alias for field number 14

now: DataFrame

Alias for field number 2

q5: float

Alias for field number 13

q95: float

Alias for field number 15

toy_data: DataFrame

Alias for field number 3

trend_month_data: DataFrame

Alias for field number 5

trend_month_intercept: float

Alias for field number 10

trend_month_slope: float

Alias for field number 9

trend_overall_data: DataFrame

Alias for field number 4

trend_overall_intercept: float

Alias for field number 8

trend_overall_slope: float

Alias for field number 7

property yes_no: str

returns a yes/no equivalent depending on the percentile
