Assessed Ranges API

Handling of assessed ranges

class pyrcmip.assessed_ranges.AssessedRanges(db)

Bases: object

Class for handling assessed ranges and performing operations with them.

For example, getting values for specific metrics and plotting results against assessed ranges.

assessed_range_label = 'assessed range'

String used for labelling assessed ranges (in plots, dataframes etc.)

Type

str

calculate_metric_from_results(metric, res_calc, custom_calculators=None)

Calculate metric values from results

Parameters
  • metric (str) – Metric for which to calculate results

  • res_calc (scmdata.ScmRun) – Results to use for the calculation

  • custom_calculators (tuple(pyrcmip.metric_calculations.base.Calculator)) – Custom calculators to use for calculating metrics which require a custom calculation

Returns

pd.DataFrame containing the calculated metric values alongside other relevant metadata

Return type

pd.DataFrame

Raises

ValueError – Data required to calculate the metric is not available

check_norm_period_evaluation_period_against_data(norm_period, evaluation_period, data)

Check the normalisation and evaluation periods against the data

Parameters
  • norm_period (None or range(int, int)) – Normalisation period to check. If None, no check is performed.

  • evaluation_period (None or range(int, int)) – Evaluation period to check. If None, no check is performed.

  • data (scmdata.ScmRun) – Data to check

Raises

ValueError – The data is incompatible with the periods (e.g. the normalisation period begins before the data begins).
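
As a rough illustration of the check described above, the sketch below works on a plain list of years rather than an scmdata.ScmRun. The helper name check_periods_against_years and its error message are hypothetical and not part of pyrcmip. Only two behaviours are taken from the description above: None periods are skipped, and a ValueError is raised when a period falls outside the data.

```python
def check_periods_against_years(norm_period, evaluation_period, data_years):
    """Hypothetical stand-in for the check, using a list of years as the data."""
    first_year, last_year = min(data_years), max(data_years)
    periods = (("normalisation", norm_period), ("evaluation", evaluation_period))
    for name, period in periods:
        if period is None:
            # No period supplied, so there is nothing to check
            continue
        if period[0] < first_year or period[-1] > last_year:
            raise ValueError(
                "{} period {}-{} is not covered by data spanning {}-{}".format(
                    name, period[0], period[-1], first_year, last_year
                )
            )


data_years = list(range(1850, 2021))

# Passes silently: both periods sit inside 1850-2020
check_periods_against_years(range(1850, 1901), range(1995, 2015), data_years)

# Fails: the normalisation period begins before the data begins
try:
    check_periods_against_years(range(1750, 1801), None, data_years)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

The real method may compare against the time axis of the run in a different way; the sketch only shows the kind of incompatibility that is reported.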

get_assessed_range_for_boxplot(metric, n_to_draw=20000)

Get assessed range for a box plot

This converts the assessed range from IPCC language (very likely, likely, central) into a distribution of values, based on pyrcmip.stats.get_skewed_normal().

Parameters
  • metric (str) – Metric for which to get assessed range distribution

  • n_to_draw (int) – Number of points to include in the returned distribution

Returns

pd.DataFrame with n_to_draw rows, each of which contains a drawn value for metric. The returned values are put in a column whose name is equal to the value of metric. We also return a "unit" column and a "Source" column. The "Source" column is filled with self.assessed_range_label. Note that if the central value is nan, the entire distribution will simply be filled with nan.

Return type

pd.DataFrame

get_col_for_metric(metric, col)

Get value of column for a given metric (i.e. RCMIP name)

Parameters
  • metric (str) – Metric whose values we want to look up

  • col (str) – Column whose values we want (e.g. “RCMIP scenario”)

Returns

The value in the column

Return type

str

Raises
  • ValueError – The metric could not be found in self.db

  • KeyError – The column could not be found in self.db
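
The lookup and its two failure modes can be pictured with a small stand-in that uses a list of dictionaries in place of self.db. The function lookup_col_for_metric, the row contents and the metric name are all hypothetical. Only the 'RCMIP name' column (the value of metric_column) and the two exception types come from this page.

```python
def lookup_col_for_metric(rows, metric, col, metric_column="RCMIP name"):
    """Hypothetical lookup over a list of dict rows standing in for self.db."""
    matches = [row for row in rows if row[metric_column] == metric]
    if not matches:
        raise ValueError("Metric not found: {}".format(metric))
    # A missing column surfaces as a KeyError from the dict lookup
    return matches[0][col]


# Made-up database content, for illustration only
rows = [
    {"RCMIP name": "Example metric", "RCMIP scenario": "example-scenario", "unit": "K"},
]

scenario = lookup_col_for_metric(rows, "Example metric", "RCMIP scenario")
```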

get_col_for_metric_list(metric, col, delimeter=',')

Get value of column for a given metric (i.e. RCMIP name), split using a delimiter

Parameters
  • metric (str) – Metric whose values we want to look up

  • col (str) – Column whose values we want (e.g. “RCMIP scenario”)

  • delimeter (str) – Delimiter used to split col’s values

Returns

List of values, derived by splitting

Return type

list

Raises

TypeError – The value found is not a string (i.e. it cannot be split by a delimiter)
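
A minimal sketch of the splitting step, assuming the looked-up value is split with str.split and no further clean-up. The helper split_column_value and the example value are hypothetical; whether pyrcmip also strips whitespace is not stated here, so this sketch does not.

```python
def split_column_value(value, delimeter=","):
    """Hypothetical helper: split a looked-up column value into a list."""
    if not isinstance(value, str):
        # Mirrors the documented TypeError for values that cannot be split
        raise TypeError(
            "Cannot split value of type {}".format(type(value).__name__)
        )
    return value.split(delimeter)


# Made-up column value holding two entries
scenarios = split_column_value("scenario-a,scenario-b")

# A non-string value (e.g. nan read from an empty cell) cannot be split
try:
    split_column_value(float("nan"))
    split_error = None
except TypeError as exc:
    split_error = str(exc)
```

The keyword is spelled delimeter in the sketch to match the spelling used in the method signature.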

get_norm_period_evaluation_period(metric)

Get normalisation and evaluation period for a given metric

Parameters

metric (str) – Metric for which to get normalisation and evaluation periods

Returns

Normalisation period and evaluation period. Each return value is a range of years which defines the relevant period. If no period is supplied, None is returned in its place. For example, if the evaluation period is 1961-1990 and there is no normalisation period, then None, range(1961, 1990 + 1) is returned.

Return type

norm_period, evaluation_period

Raises

ValueError – A period could not be resolved because it is ambiguous, i.e. one of its start/end values is nan while the other is not.
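
The return convention can be sketched as follows. The helper resolve_period is hypothetical and is not the pyrcmip implementation. It assumes each period is stored as a start year and an end year, with nan marking a missing value, as the Raises entry above suggests.

```python
import math


def resolve_period(start, end):
    """Hypothetical helper: turn start/end years into an inclusive range or None."""
    start_missing = math.isnan(start)
    end_missing = math.isnan(end)
    if start_missing and end_missing:
        # No period supplied
        return None
    if start_missing != end_missing:
        # Only one end of the period is known, so it cannot be resolved
        raise ValueError(
            "Ambiguous period: start={}, end={}".format(start, end)
        )
    # range() excludes its stop value, hence the + 1 to include the end year
    return range(int(start), int(end) + 1)


norm_period = resolve_period(float("nan"), float("nan"))
evaluation_period = resolve_period(1961, 1990)
```

Here norm_period is None and evaluation_period is range(1961, 1990 + 1), which matches the example given in the Returns description.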

get_results_summary_table_for_metric(metric, model_results)

Get results summary table for a given metric

Parameters
  • metric (str) – Metric for which to get the summary table

  • model_results (pd.DataFrame) – pd.DataFrame containing the model results. It must have at least the following columns: "climate_model", "value".

Returns

pd.DataFrame containing a summary of the results. The percentage difference is calculated as (model_value - assessed_value) / np.abs(assessed_value) * 100.

Return type

pd.DataFrame
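
The percentage difference formula quoted above can be checked by hand on scalars. In the sketch below, the function name percentage_difference and the numbers are illustrative only, and the builtin abs stands in for np.abs because the inputs are plain floats.

```python
def percentage_difference(model_value, assessed_value):
    """Percentage difference of a model value relative to an assessed value."""
    return (model_value - assessed_value) / abs(assessed_value) * 100


# Positive assessed value: the model is 25% above the assessed value
above = percentage_difference(3.75, 3.0)

# Negative assessed value: dividing by the absolute value keeps the sign
# tied to the direction of the difference (here the model is less negative)
less_negative = percentage_difference(-1.5, -2.0)
```

Both calls give 25.0. Without the absolute value in the denominator, the second case would come out as -25.0 even though the model value is larger than the assessed value.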

get_variables_regions_scenarios_for_metric(metric, single_value=True)

Get variables, regions and scenarios required to calculate a given metric

Parameters

metric (str) – Metric for which to get values

Returns

Dictionary containing required variables, regions and scenarios

Return type

dict

head(n=5)

Get head of self.db

Parameters

n (int) – Number of rows to return

Returns

Head of self.db

Return type

pd.DataFrame

metric_column = 'RCMIP name'

Name of the column which holds the names of the metrics being assessed

Type

str

plot_against_results(results_database, climate_models=['*'], custom_calculators=None, palette=None)

Calculate metric values from results, compare and plot against assessed ranges

Parameters
  • results_database (pyrcmip.database.DataBase) – Database from which to load results

  • climate_models (list[str]) – Climate models to calculate results for

  • custom_calculators (tuple(pyrcmip.metric_calculations.base.Calculator)) – Custom calculators to use for calculating metrics which require a custom calculation

  • palette (dict[str, str]) – Colours to use for the different climate models and assessed ranges when plotting

Returns

pd.DataFrame formed by concatenating the results of calling get_results_summary_table_for_metric() for each metric.

Return type

pd.DataFrame

plot_metric_and_results(metric, model_results, axes=None, palette=None)

Plot our parameterisation of the metric’s distribution and the model results

This produces a two-panel plot. The top panel shows the distributions and the bottom panel shows box and whisker plots (with the boxes and whiskers adjusted to match the IPCC calibrated likelihood language).

Parameters
  • metric (str) – Metric to plot

  • model_results (pd.DataFrame) – pd.DataFrame with the model results. Should be of the form returned by calculate_metric_from_results().

  • axes ((matplotlib.axes.SubplotBase, matplotlib.axes.SubplotBase)) – Axes on which to make the plots. Must contain exactly two panels.

  • palette (dict[str, str]) – Colours to use for the different climate models and assessed ranges

Returns

Axes on which the plot was made

Return type

(matplotlib.axes.SubplotBase, matplotlib.axes.SubplotBase)

Raises

AssertionError – axes doesn’t have a length equal to two

plot_metric_and_results_box_only(metric, model_results, ax=None, palette=None)

Plot box and whisker plots of the metric’s distribution and the model results

The box and whisker plots have the boxes and whiskers adjusted to match the IPCC calibrated likelihood language.

Parameters
  • metric (str) – Metric to plot

  • model_results (pd.DataFrame) – pd.DataFrame with the model results. Should be of the form returned by calculate_metric_from_results().

  • ax (matplotlib.axes.SubplotBase) – Axis on which to make the plot

  • palette (dict[str, str]) – Colours to use for the different climate models and assessed ranges

Returns

Axes on which the plot was made

Return type

matplotlib.axes.SubplotBase

plot_model_reported_against_assessed_ranges(model_reported, palette=None)

Compare and plot model reported results against assessed ranges

Parameters
  • model_reported (pd.DataFrame) – pd.DataFrame of the same format as the result of calculate_metric_from_results()

  • palette (dict[str, str]) – Colours to use for the different climate models and assessed ranges when plotting

Returns

pd.DataFrame formed by concatenating the results of calling get_results_summary_table_for_metric() for each metric.

Return type

pd.DataFrame

tail(n=5)

Get tail of self.db

Parameters

n (int) – Number of rows to return

Returns

Tail of self.db

Return type

pd.DataFrame