Welcome to isithot documentation!¶

Installation¶

via https

pip install git+https://github.com/RUBclim/isithot

via ssh

pip install git+ssh://git@github.com/RUBclim/isithot

Quick start¶

An initial app can be created quite simply by adding a single data provider.

Adding data providers¶

Subclass isithot.DataProvider. The DataProvider.get_current_data() and DataProvider.get_daily_data() methods need to be implemented.
Create an isithot.ColumnMapping instance which maps the columns of your data source to the columns the package expects.
Create a dictionary of isithot.DataProvider instances in which each key matches the id of its provider.
Register the data providers with the current app: app.config['DATA_PROVIDERS'] = data_providers.

from datetime import date

import pandas as pd
from flask import Flask
from isithot import ColumnMapping
from isithot import create_app
from isithot import DataProvider
from isithot.config import Config


class TestProvider(DataProvider):
    def get_current_data(self, d: date) -> pd.DataFrame:
        df = pd.DataFrame({
            'date': [pd.Timestamp(d)],
            'temp_max': [30.0],
            'temp_min': [20.0],
        })
        return df.set_index('date')

    def get_daily_data(self, d: date) -> pd.DataFrame:
        x = pd.read_csv(
            'testing/monthly_input_data/lmss_daily_long.csv',
            parse_dates=['date'],
            index_col='date',
        )
        x['doy'] = x.index.dayofyear
        return x


def my_app() -> Flask:
    col_map = ColumnMapping(
        datetime='date',
        temp_mean='temp_mean_mannheim',
        temp_max='temp_max',
        temp_min='temp_min',
        day_of_year='doy',
    )

    data_providers = {
        'test': TestProvider(
            col_mapping=col_map,
            name='Test',
            id='test',
            min_year=2010,
        ),
    }

    app = create_app(Config)
    app.config['DATA_PROVIDERS'] = data_providers
    return app


if __name__ == '__main__':
    app = my_app()
    app.run(debug=True)

implementing caching¶

The isithot app comes with a cache that can be added to a function. For example, the daily data will likely not change very often, so it can be cached for one hour.

from isithot.cache import cache

class TestProvider(DataProvider):
    @cache.cached(timeout=60*60, key_prefix='daily_data')
    def get_daily_data(self, d: date) -> pd.DataFrame:
        ...
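
To illustrate the idea of a time-limited cache independently of the app, here is a minimal standard-library sketch. It is not the implementation behind isithot.cache; the decorator name timed_cache and the function load_daily_data are hypothetical. The sketch keys its entries on the call arguments, which is worth considering whenever the result depends on d.

```python
import time
from datetime import date
from functools import wraps
from typing import Any, Callable


def timed_cache(timeout: float) -> Callable:
    """Hypothetical decorator: cache results per argument tuple for `timeout` seconds."""
    def decorator(func: Callable) -> Callable:
        store: dict = {}  # maps argument tuple -> (time stored, result)

        @wraps(func)
        def wrapper(*args: Any) -> Any:
            now = time.monotonic()
            hit = store.get(args)
            if hit is not None and now - hit[0] < timeout:
                return hit[1]  # still fresh: skip the expensive call
            result = func(*args)
            store[args] = (now, result)
            return result
        return wrapper
    return decorator


calls = []


@timed_cache(timeout=60 * 60)
def load_daily_data(d: date) -> str:
    # stand-in for an expensive database query or file read
    calls.append(d)
    return f'daily data up to {d.isoformat()}'


first = load_daily_data(date(2024, 7, 1))
second = load_daily_data(date(2024, 7, 1))  # served from the cache
```

Calling load_daily_data with a different date runs the function again, because that date has no cache entry yet.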

more complex data retrieval¶

A more complex example, which uses database queries, can be found in testing/example_app.py. All implementations need to consider performance, since the data retrieval is executed while the HTTP request is being handled.

Another option for data retrieval is for the server to perform an API request, e.g.

    def get_current_data(self, d: date) -> pd.DataFrame:
        """
        fetch the latest weather data from the DWD. ``self.id`` corresponds to the
        station ID by DWD which is set during DataProvider creation.
        """
        ret = urllib.request.urlopen(
            f'https://dwd.api.proxy.bund.dev/v30/stationOverviewExtended?stationIds={self.id}',
            timeout=3,
        )
        data = current_app.json.loads(ret.read())
        temp_min = data[self.id]['days'][0]['temperatureMin'] / 10
        temp_max = data[self.id]['days'][0]['temperatureMax'] / 10
        date = datetime.strptime(
            data[self.id]['days'][0]['dayDate'], '%Y-%m-%d',
        )
        return pd.DataFrame(
            {
                self.col_mapping.temp_min: temp_min,
                self.col_mapping.temp_max: temp_max,
            },
            index=pd.DatetimeIndex([date], name=self.col_mapping.datetime),
        )
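
One way to keep such a method testable is to separate the parsing from the network call. The following is a minimal sketch under stated assumptions: the function name parse_station_overview, the station ID '12345' and the sample payload are made up for illustration and only mirror the fields accessed in the snippet above. It is not real DWD output.

```python
import json
from datetime import datetime


def parse_station_overview(raw: str, station_id: str) -> tuple:
    """Extract (day, temp_min, temp_max) from the first entry in 'days'."""
    data = json.loads(raw)
    day = data[station_id]['days'][0]
    # the snippet above divides by 10, so the raw values are assumed
    # to be tenths of a degree
    temp_min = day['temperatureMin'] / 10
    temp_max = day['temperatureMax'] / 10
    day_date = datetime.strptime(day['dayDate'], '%Y-%m-%d')
    return day_date, temp_min, temp_max


# made-up payload for illustration only
sample = json.dumps({
    '12345': {
        'days': [
            {
                'dayDate': '2024-07-01',
                'temperatureMin': 152,
                'temperatureMax': 268,
            },
        ],
    },
})

day_date, temp_min, temp_max = parse_station_overview(sample, '12345')
```

With the parsing isolated like this, get_current_data() only has to fetch the response body and build the DataFrame from the three returned values.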

API-Documentation¶

i18n¶

This web app uses internationalization (i18n) so that the page is also available in German, since the audience will mostly be German. This is set up via Babel, and all English text (in both .py and .html files) is wrapped in a _(...) function. These strings can be extracted automatically via:

pybabel extract -F babel.cfg -o isithot/translations/messages.pot .

This will generate a messages.pot file which is the basis for all translations. Based on this file, a translation can be initialized with the following command, in this case for German (de).

pybabel init -i isithot/translations/messages.pot -d isithot/translations/ -l de

This will create a subfolder for the specific language (in this case de for German). The messages.po file in that subfolder can now be used to translate all messages.

Finally, the translations have to be compiled into a messages.mo file. This needs to be done manually for testing. It is done automatically for production while building the Docker image.

pybabel compile -d isithot/translations

Important

If any of the strings wrapped in a _(...) function (in a .py or .html file) are changed, the .pot file and the translations need to be updated using these commands:

pybabel extract -F babel.cfg -o isithot/translations/messages.pot .

pybabel update -i isithot/translations/messages.pot -d isithot/translations
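
For reference, the _(...) wrapping described at the start of this section looks roughly like the sketch below in a .py file. It uses the standard-library gettext module so that it runs on its own; the app itself presumably obtains _ from its Babel integration (an assumption), and both the message text and the headline() function are hypothetical.

```python
import gettext

# NullTranslations returns every message unchanged; a real app would load
# the compiled messages.mo for the active language instead
translations = gettext.NullTranslations()
_ = translations.gettext


def headline(station_name: str) -> str:
    # the string literal inside _(...) is what the extraction step collects
    return _('Is it hot in %(name)s right now?') % {'name': station_name}


text = headline('Bochum')
```

Using a named placeholder such as %(name)s keeps the whole sentence in one translatable string, so translators can move the placeholder to wherever the target language needs it.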

`app`¶

isithot.app.create_app(config)[source]¶

Create and configure the isithot Flask application.

Parameters: config (object) – Configuration object to use for the Flask app.
Return type: Flask
Returns: Configured Flask application instance.

`blueprints`¶

isithot.blueprints.isithot.get_locale()[source]¶

Utility for getting the language from the Accept-Language header.

Return type: str | None
Returns: the language key, either de or en
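
To make the idea concrete, the following is a minimal standard-library sketch of choosing between de and en based on an Accept-Language header value. It is not the package's implementation, which may rely on helpers from its web framework; the function name pick_locale and the SUPPORTED tuple are hypothetical.

```python
from typing import Optional

SUPPORTED = ('de', 'en')


def pick_locale(accept_language: str) -> Optional[str]:
    """Return the supported language with the highest q-value, or None."""
    candidates = []
    for position, part in enumerate(accept_language.split(',')):
        token, _sep, params = part.strip().partition(';')
        lang = token.strip().lower().split('-')[0]  # 'de-DE' -> 'de'
        quality = 1.0  # entries without a q parameter default to 1.0
        params = params.strip()
        if params.startswith('q='):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not acceptable"
        if lang in SUPPORTED and quality > 0:
            # the negative position prefers the earlier entry on equal q-values
            candidates.append((quality, -position, lang))
    if not candidates:
        return None
    return max(candidates)[2]


best = pick_locale('de-DE,de;q=0.9,en;q=0.8')
```

A header that lists no supported language yields None, which matches the str | None return type documented above.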

isithot.blueprints.isithot.index()[source]¶

A simple route to have a nicer link to share.

Return type: Response

isithot.blueprints.isithot.last_years_calendar(station, year)[source]¶

Returns the calendar figure data as json for the specified year.

This route is cached indefinitely and does not take the locale into account, since it’s only static data.

Parameters:

station (str) – The station a plot is created for.
year (int) – The year a plot is created for.

Return type:

str

isithot.blueprints.isithot.plots(station)[source]¶

Renders the isithot page with all plots.

This route is cached since compiling the data and generating the plots is quite expensive. The cache expires after 5 minutes, so the page still shows almost live data.

Parameters: station (str) – The station a plot is created for.
Return type: str

class isithot.blueprints.plots.ColumnMapping(datetime: str, temp_mean: str, temp_max: str, temp_min: str, day_of_year: str)[source]¶

Class for defining the mapping between the column names of the data source and the parameters the package needs.

Parameters:

datetime – the column name of the column that stores the date (and maybe time) information
temp_mean – the column name of the column that stores the average air-temperature information
temp_max – the column name of the column that stores the maximum air-temperature information
temp_min – the column name of the column that stores the minimum air-temperature information
day_of_year – the column name of the column that stores the day of year number

datetime: str¶: Alias for field number 0

day_of_year: str¶: Alias for field number 4

temp_max: str¶: Alias for field number 2

temp_mean: str¶: Alias for field number 1

temp_min: str¶: Alias for field number 3

class isithot.blueprints.plots.DataProvider(col_mapping, name, id, min_year)[source]¶

Base Class for defining a custom data provider. get_daily_data() and get_current_data() need to be overridden.

Parameters:

col_mapping (ColumnMapping) – a ColumnMapping() mapping the column names returned by get_daily_data() or get_current_data() to variables so they can be used later
name (str) – the name of the station that is displayed on the website
id (str) – the ID of the station that is used for compiling links. If multiple DataProviders are used, each one must have a unique id.
min_year (int) – the minimum year for which data is available. This is used to determine the first year for which a calendar plot is created.

calendar_fig(calendar_data)[source]¶

Creates a figure representing a calendar plot of the current year, indicating the percentile of each day as a color and a number.

Parameters: calendar_data (DataFrame) – a pd.DataFrame() containing all data necessary for creating the plot
Return type: Figure
Returns: a Figure() object that can be used as JSON on the page, defining the plot including all data

distrib_fig(fig_data)[source]¶

Creates a figure representing the distribution with the 5% and 95% percentiles, the trend for the time of year and the overall warming trend.

Parameters: fig_data (PlotData) – a PlotData() object containing all data necessary for creating the plot
Return type: Figure
Returns: a Figure() object that can be used as JSON on the page, defining the plot including all data

get_current_data(d)[source]¶

This method needs to be implemented and will most likely be a database query or a file read. It might make sense to cache this function. d may be used as a cache key.

This should return a pd.DataFrame() with columns containing:

date (as a datetime object)
maximum temperature
minimum temperature

The index must be a pd.DatetimeIndex(). The column names must match those defined via col_mapping.

Parameters: d (date) – the date for which to prepare data. This will usually be today
Return type: DataFrame

get_daily_data(d)[source]¶

This method needs to be implemented and will most likely be a database query or a file read. It might make sense to cache this function. d may be used as a cache key.

This should return a pd.DataFrame() with columns containing:

date (as a datetime object)
mean temperature
the day of the year

The index must be a pd.DatetimeIndex(). The column names must match those defined via col_mapping.

Parameters: d (date) – the date for which to prepare data. This will usually be today
Return type: DataFrame

hist_fig(fig_data)[source]¶

Creates a figure representing a histogram or, more specifically, a kernel density estimate. This includes lines for the 5% and 95% percentiles as well as the median. A red line for today’s value is added.

Parameters: fig_data (PlotData) – a PlotData() object containing all data necessary for creating the plot
Return type: Figure
Returns: a Figure() object that can be used as JSON on the page, defining the plot including all data

prepare_daily_and_calendar_data(d, current_avg=None)[source]¶

This gets the daily data from the database and creates the calendar plot data. It is separated from _prepare_data() so it can be used via last_years_calendar().

Parameters:

d (date) – the date for which to prepare data. This will usually be today or in this case the first day of the year to prepare the calendar data for
current_avg (float | None) – This is used to add the current day which has no entry in the daily data just yet. When working with previous years, this should be left as None (default: None)

Return type:

tuple[DataFrame, DataFrame]

Returns:

a tuple of pd.DataFrame(): (daily, calendar_data)

prepare_data(d)[source]¶

The purpose of this function is to compile an isithot.blueprints.plots.PlotData() object which is used for the creation of all plots.

Parameters: d (date) – the date for which to prepare data. This will usually be today
Return type: PlotData
Returns: the data needed for creating the plots and texts, all contained in an isithot.blueprints.plots.PlotData() object

class isithot.blueprints.plots.PlotData(current_date: date, daily: pd.DataFrame, now: pd.DataFrame, toy_data: pd.DataFrame, trend_overall_data: pd.DataFrame, trend_month_data: pd.DataFrame, calendar_data: pd.DataFrame, trend_overall_slope: float, trend_overall_intercept: float, trend_month_slope: float, trend_month_intercept: float, current_avg: float, current_avg_percentile: float, q5: float, median: float, q95: float)[source]¶

Parameters:

current_date – The date for which the data is compiled. This is usually today
daily – A pandas dataframe containing all daily data that is available in the database
now – The latest data from the station (high resolution raw data)
toy_data – Data for the current time of year (toy). For this, the period from a week before current_date to a week after current_date is extracted
trend_overall_data – (Yearly) data needed to calculate the overall trend since the start of the measurements
trend_month_data – Data needed for calculating the trend for the current month
calendar_data – Data needed to create a calendar plot for the current year
trend_overall_slope – The slope of the line for the overall warming trend across all years and times of year
trend_overall_intercept – The intercept of the line for the overall warming trend across all years and times of year
trend_month_slope – The slope of the line for the current warming trend across all years for the current time of year \(\pm\) 7 days
trend_month_intercept – The intercept of the line for the current warming trend across all years for the current time of year \(\pm\) 7 days
current_avg – The current average of today calculated from averaging the minimum and maximum temperature
current_avg_percentile – The percentile of current_avg
q5 – the 5% percentile for this time of the year
median – the median/50% percentile for this time of the year
q95 – the 95% percentile for this time of the year
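
As a rough illustration of how current_avg, current_avg_percentile, q5, median and q95 relate to each other, here is a standard-library sketch. All numbers are made up, and the percentile and quantile definitions used here are assumptions, since the exact methods used by the package are not stated in this section.

```python
import statistics

# hypothetical daily mean temperatures (deg C) for the same time of year
# in past years; not measured data
toy_means = [
    14.1, 15.3, 16.0, 16.8, 17.2, 17.9, 18.4, 19.0,
    19.7, 20.5, 21.1, 21.8, 22.4, 23.0, 24.2,
]

temp_min, temp_max = 20.0, 30.0          # today's extremes (made up)
current_avg = (temp_min + temp_max) / 2  # average of minimum and maximum

# share of historical values at or below today's average, in percent
current_avg_percentile = (
    100 * sum(v <= current_avg for v in toy_means) / len(toy_means)
)

# with n=20 the first and last cut points are the 5% and 95% quantiles
cuts = statistics.quantiles(toy_means, n=20, method='inclusive')
q5, q95 = cuts[0], cuts[-1]
median = statistics.median(toy_means)
```

In this made-up sample today's average lies above every historical value, so it also lies above q95.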

property avg_compare: str¶: returns a more comprehensive sentence of yes/no

calendar_data: DataFrame¶: Alias for field number 6

current_avg: float¶: Alias for field number 11

current_avg_percentile: float¶: Alias for field number 12

current_date: date¶: Alias for field number 0

daily: DataFrame¶: Alias for field number 1

property hot_warm: str¶

median: float¶: Alias for field number 14

now: DataFrame¶: Alias for field number 2

q5: float¶: Alias for field number 13

q95: float¶: Alias for field number 15

toy_data: DataFrame¶: Alias for field number 3

trend_month_data: DataFrame¶: Alias for field number 5

trend_month_intercept: float¶: Alias for field number 10

trend_month_slope: float¶: Alias for field number 9

trend_overall_data: DataFrame¶: Alias for field number 4

trend_overall_intercept: float¶: Alias for field number 8

trend_overall_slope: float¶: Alias for field number 7

property yes_no: str¶: returns a yes/no equivalent depending on the percentile
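
The slope and intercept fields of PlotData each describe a straight line through temperature values over the years. The sketch below shows an ordinary least-squares fit in plain Python to make that relationship concrete. The helper fit_line and the yearly values are hypothetical, and the package may use a different fitting routine.

```python
def fit_line(xs: list, ys: list) -> tuple:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# made-up yearly mean temperatures (deg C); not measured data
years = [2010, 2011, 2012, 2013, 2014]
yearly_means = [10.0, 10.5, 11.0, 11.5, 12.0]

slope, intercept = fit_line(years, yearly_means)


def trend_value(year: int) -> float:
    # evaluate the fitted trend line for a given year
    return slope * year + intercept
```

Here the slope is expressed in degrees per year, so multiplying it by 10 gives the warming per decade in this toy data.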
