Given a forecast and a set of observations, compute the bias of the forecast's predictions.

Usage

bias(fcst, obs, summarize = TRUE)

Arguments

fcst

A forecast object (see output of create_forecast()).

obs

An observations data frame.

summarize

A logical value, defaults to TRUE. If TRUE, a single number is returned as the score for the whole forecast. If FALSE, a data frame with columns named time, val_obs, and score is returned, containing the score for each individual time point. Plotting functions can use this to colour-code observations, for example.

Value

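If summarize = TRUE, a number between -1 and 1, inclusive. -1 means every prediction was below its observation (100% underprediction) and 1 means every prediction was above it (100% overprediction). If summarize = FALSE, a data frame of per-time scores, each in the same range.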

Details

bias() looks for forecast data in the following order:

  1. raw data (val)

  2. mean (val_mean)

  3. median (val_q50)

It uses the first one it finds and compares each forecast value with the observation at the same time, assigning 1 for overprediction, 0 for equality, and -1 for underprediction. The score is the mean of these assigned values.
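The rule above can be sketched in base R. This is a minimal illustration under stated assumptions, not casteval's actual implementation: the helper name bias_sketch() is hypothetical, it takes plain data frames rather than a forecast object, and it pairs forecasts with observations using merge().

```r
# Hypothetical helper illustrating the documented rule; not part of casteval.
bias_sketch <- function(fcst_df, obs_df) {
  # Use the first available forecast column, in the documented order.
  candidates <- c("val", "val_mean", "val_q50")
  col <- candidates[candidates %in% names(fcst_df)][1]
  if (is.na(col)) stop("no forecast values found")

  # Pair each forecast value with the observation at the same time.
  joined <- merge(fcst_df, obs_df, by = "time")

  # sign() gives 1 for overprediction, 0 for equality, -1 for underprediction.
  mean(sign(joined[[col]] - joined$val_obs))
}

obs_demo <- data.frame(time = 1:3, val_obs = rep(10, 3))
fcst_demo <- data.frame(
  time = c(1, 1, 2, 2, 3, 3),
  val  = c(9, 9, 10, 10, 11, 9)
)

# Signs are -1, -1, 0, 0, 1, -1, so the mean is -2/6.
bias_sketch(fcst_demo, obs_demo)
```

Because only the sign of each difference is used, the score reflects how often the forecast over- or underpredicts, not by how much.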

Grouping

If summarize=FALSE is passed, scores are computed separately for each time point. If group columns are present, the data are grouped by those columns before scoring. In either case, the return value is a data frame rather than a single number, with columns for the time, observation, and score, plus the group columns if they exist. See vignette("casteval") for more details.
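Per-time scoring can be pictured as taking the mean sign within each time point. The following base R sketch assumes plain data frames and uses aggregate(). It illustrates the idea only and is not how casteval is implemented.

```r
# Hypothetical illustration of per-time scoring; not casteval's implementation.
obs_demo <- data.frame(time = 1:3, val_obs = rep(10, 3))
fcst_demo <- data.frame(
  time = c(1, 1, 2, 2, 3, 3),
  val  = c(9, 9, 9, 10, 11, 11)
)

# Pair forecasts with observations, then record over/under/equal as 1/-1/0.
joined <- merge(fcst_demo, obs_demo, by = "time")
joined$score <- sign(joined$val - joined$val_obs)

# One row per time point: the mean of the signs at that time.
per_time <- aggregate(score ~ time + val_obs, data = joined, FUN = mean)
per_time
```

In this sketch, a group column would be handled by adding it to the right-hand side of the formula, so that each group and time combination gets its own score.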

Examples

obs <- data.frame(time=1:5, val_obs=rep(10,5))

# A forecast with bias on individual days, but no overall bias
fc1 <- create_forecast(dplyr::tibble(
  time=c(1,1,2,2,3,3,4,4,5,5),
  val=c(9, 9, 9, 10, 10, 10, 10, 11, 11, 11)
))

bias(fc1, obs, summarize=FALSE)
#> # A tibble: 5 × 3
#>    time val_obs score
#>   <dbl>   <dbl> <dbl>
#> 1     1      10  -1  
#> 2     2      10  -0.5
#> 3     3      10   0  
#> 4     4      10   0.5
#> 5     5      10   1  

bias(fc1, obs)
#> [1] 0

# A forecast with an underprediction bias
fc2 <- create_forecast(data.frame(
  time=c(1,1,1,2,2,2,3,3,3),
  val=c(9,9,9,10,10,10,11,9,9)
))

bias(fc2, obs, summarize=FALSE)
#> # A tibble: 3 × 3
#>    time val_obs  score
#>   <dbl>   <dbl>  <dbl>
#> 1     1      10 -1    
#> 2     2      10  0    
#> 3     3      10 -0.333

bias(fc2, obs)
#> [1] -0.4444444