Title: | Basic tools for scoring hubverse forecasts |
---|---|
Description: | Using functionality from the scoringutils package, this software provides basic tools for scoring hubverse forecasts. |
Authors: | Nicholas Reich [aut, cre], Evan Ray [aut], Nikos Bosse [aut], Matthew Cornell [aut], Zhian Kamvar [ctb], Becky Sweger [aut], Kimberlyn Roosa [aut] |
Maintainer: | Nicholas Reich <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.0.0.9001 |
Built: | 2024-11-20 14:30:47 UTC |
Source: | https://github.com/hubverse-org/hubEvals |
Scores model outputs with a single output_type against observed data.
score_model_out( model_out_tbl, target_observations, metrics = NULL, summarize = TRUE, by = "model_id", output_type_id_order = NULL )
model_out_tbl | Model output tibble with predictions |
target_observations | Observed 'ground truth' data to be compared to predictions |
metrics | Character vector of scoring metrics to compute. If NULL (the default), all metrics available for the output type are computed. |
summarize | Boolean indicator of whether summaries of forecast scores should be computed. Defaults to TRUE. |
by | Character vector naming columns to summarize by. For example, specifying by = "model_id" (the default) computes the mean score for each model. |
output_type_id_order | For ordinal variables in pmf format, this is a vector of levels for pmf forecasts, in increasing order of the levels. For all other output types, this is ignored. |
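To make the summarize and by arguments more concrete, the following base R sketch shows what averaging scores within groups looks like. The data frame scores and all of its values are made up for illustration, and this is only an analogue of the behavior described above, not the package's internal implementation.

```r
# Hypothetical per-forecast scores (made-up values, for illustration only)
scores <- data.frame(
  model_id = c("model_a", "model_a", "model_b", "model_b"),
  location = c("01", "02", "01", "02"),
  wis      = c(1.0, 3.0, 2.0, 6.0)
)

# Analogue of by = "model_id": one mean score per model
by_model <- aggregate(wis ~ model_id, data = scores, FUN = mean)

# Analogue of by = c("model_id", "location"): one mean score per
# model-location combination (here each group holds a single forecast)
by_model_location <- aggregate(wis ~ model_id + location, data = scores, FUN = mean)

print(by_model)
print(by_model_location)
```

Adding more columns to the grouping gives finer-grained summaries; grouping by model_id alone collapses everything to one row per model.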
Default metrics are provided by the scoringutils package. You can select metrics by passing in a character vector of metric names to the metrics argument.
The following metrics can be selected (all are used by default) for the different output_types:
Quantile forecasts (output_type == "quantile"):
wis
overprediction
underprediction
dispersion
bias
interval_coverage_deviation
ae_median
"interval_coverage_XX": interval coverage at the "XX" level. For example, "interval_coverage_95" is the 95% interval coverage rate, which would be calculated based on quantiles at the probability levels 0.025 and 0.975.
See scoringutils::get_metrics.forecast_quantile for details.
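The relationship between an interval coverage level and its bounding quantile levels can be sketched in base R as follows. The helper interval_quantile_levels and the forecast values are hypothetical, and treating the interval bounds as inclusive is an assumption of this sketch rather than a statement about how scoringutils computes coverage.

```r
# Map an interval level (in percent) to the pair of quantile levels
# that bound the central prediction interval.
# (Hypothetical helper, not part of hubEvals or scoringutils.)
interval_quantile_levels <- function(level_pct) {
  alpha <- 1 - level_pct / 100
  c(lower = alpha / 2, upper = 1 - alpha / 2)
}

# For the 95% interval this gives the 0.025 and 0.975 quantile levels
levels_95 <- interval_quantile_levels(95)
print(levels_95)

# Hypothetical lower/upper bounds of 80% intervals and the matching
# observations (made-up values)
lower    <- c(10, 20, 30, 40)
upper    <- c(20, 35, 45, 60)
observed <- c(15, 36, 30, 50)

# An observation counts as covered if it lies within the interval
# (bounds treated as inclusive in this sketch)
covered <- observed >= lower & observed <= upper

# Empirical coverage rate: the share of covered observations
coverage_rate <- mean(covered)
print(coverage_rate)
```

For a well-calibrated model, the empirical coverage rate of an XX% interval should be close to XX/100 when averaged over many forecasts.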
Nominal forecasts (output_type == "pmf" and output_type_id_order is NULL):
log_score
(scoring for ordinal forecasts will be added in the future).
See scoringutils::get_metrics.forecast_nominal for details.
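As a minimal sketch of what a log score measures for a single pmf forecast, the base R snippet below evaluates the probability assigned to the observed category. The categories and probabilities are made up, and the sign convention used here (negative log probability, so that smaller is better) is an assumption of this sketch; check the scoringutils documentation for the exact definition it uses.

```r
# Hypothetical pmf forecast over three categories (probabilities sum to 1)
predicted <- c(low = 0.2, moderate = 0.5, high = 0.3)

# Hypothetical observed category
observed <- "moderate"

# Log score, taken here as the negative log of the probability
# assigned to the observed category (assumed convention)
log_score <- -log(predicted[[observed]])
print(log_score)
```

Under this convention a forecast that puts more probability on the category that actually occurred receives a lower score.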
Median forecasts (output_type == "median"):
ae_point: absolute error of the point forecast (recommended for the median, see Gneiting (2011))
See scoringutils::get_metrics.forecast_point for details.
Mean forecasts (output_type == "mean"):
se_point: squared error of the point forecast (recommended for the mean, see Gneiting (2011))
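The two point-forecast metrics are simple to write out by hand. The base R sketch below uses made-up observations and forecasts to show the absolute error of a median forecast and the squared error of a mean forecast; the variable names are illustrative only.

```r
# Hypothetical observations and point forecasts (made-up values)
observed        <- c(10, 12, 15)
median_forecast <- c(11, 12, 12)
mean_forecast   <- c(11.5, 12, 13)

# Absolute error of the point forecast, paired with the median
ae_point <- abs(observed - median_forecast)

# Squared error of the point forecast, paired with the mean
se_point <- (observed - mean_forecast)^2

print(ae_point)  # 1 0 3
print(se_point)  # 2.25 0.00 4.00
```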
A data.table with scores
Gneiting, Tilmann (2011). Making and Evaluating Point Forecasts. Journal of the American Statistical Association.
library(hubEvals)

# compute WIS and interval coverage rates at 80% and 90% levels based on
# quantile forecasts, summarized by the mean score for each model
quantile_scores <- score_model_out(
  model_out_tbl = hubExamples::forecast_outputs |>
    dplyr::filter(.data[["output_type"]] == "quantile"),
  target_observations = hubExamples::forecast_target_observations,
  metrics = c("wis", "interval_coverage_80", "interval_coverage_90"),
  by = "model_id"
)
quantile_scores

# compute log scores based on pmf predictions for categorical targets,
# summarized by the mean score for each combination of model, location,
# and horizon.
# Note: if the model_out_tbl had forecasts for multiple targets using a
# pmf output_type with different bins, it would be necessary to score the
# predictions for those targets separately.
pmf_scores <- score_model_out(
  model_out_tbl = hubExamples::forecast_outputs |>
    dplyr::filter(.data[["output_type"]] == "pmf"),
  target_observations = hubExamples::forecast_target_observations,
  metrics = "log_score",
  by = c("model_id", "location", "horizon")
)
head(pmf_scores)
Transform pmf model output into a forecast object
transform_pmf_model_out( model_out_tbl, target_observations, output_type_id_order = NULL )
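model_out_tbl | Model output tibble with predictions |
target_observations | Observed 'ground truth' data to be compared against predictions |
output_type_id_order | For nominal variables, this should be NULL (the default). For ordinal variables, this is a vector of levels for pmf forecasts, in increasing order of the levels. |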
forecast_nominal
Transform either mean or median model output into a point forecast object
transform_point_model_out(model_out_tbl, target_observations, output_type)
model_out_tbl | Model output tibble with predictions |
target_observations | Observed 'ground truth' data to be compared against predictions |
output_type | Forecast output type: "mean" or "median" |
This function transforms a model output tibble in the Hubverse format (with either "mean" or "median" output type) to a scoringutils "point" forecast object.
forecast_point
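Conceptually, a transformation of this kind pairs each prediction with its observation and renames the columns to a predicted/observed layout. The base R sketch below illustrates that idea on made-up data; the column names (location, target_end_date, value, observation) and the choice of join keys are assumptions for illustration, and this is not the package's implementation.

```r
# Hypothetical model output in a hubverse-like layout (made-up values)
model_out_tbl <- data.frame(
  model_id        = c("model_a", "model_b"),
  location        = c("01", "01"),
  target_end_date = as.Date(c("2024-01-06", "2024-01-06")),
  output_type     = "median",
  output_type_id  = NA,
  value           = c(120, 95)
)

# Hypothetical observed data keyed by the same task ID columns
target_observations <- data.frame(
  location        = "01",
  target_end_date = as.Date("2024-01-06"),
  observation     = 100
)

# Attach the observation to each prediction via the shared task ID columns
joined <- merge(
  model_out_tbl, target_observations,
  by = c("location", "target_end_date")
)

# Rename to a predicted/observed layout with one row per model and task
point_data <- data.frame(
  model           = joined$model_id,
  location        = joined$location,
  target_end_date = joined$target_end_date,
  predicted       = joined$value,
  observed        = joined$observation
)
print(point_data)
```

A data frame in this shape is the kind of input that a constructor such as scoringutils::as_forecast_point() is designed to accept; whether the package calls it in exactly this way is not stated here.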
Transform quantile model output into a forecast object
transform_quantile_model_out(model_out_tbl, target_observations)
model_out_tbl | Model output tibble with predictions |
target_observations | Observed 'ground truth' data to be compared against predictions |
forecast_quantile