Bayesian Forecasting – Evaluating Probabilistic/Bayesian Forecasts: PIT Values & Generation Methods

bayesian · forecasting · machine-learning · model-evaluation · pit

Suppose you are modelling a linear regression $y_i = \alpha + \beta x_i + \epsilon_i$, in probabilistic terms:
$$ \mu_i = \alpha + \beta x_i, $$
$$ y_i \sim \mathcal{N}(\mu_i, \sigma^2). $$

For each new data point $x_{\text{new}}$, the model's prediction is a distribution over possible values of the target variable. Since the true predictive density is never observed, we need a way to evaluate these predictions against the observed data.
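
For concreteness, here is a minimal R sketch of what such a predictive distribution looks like in sample form. The posterior draws below are invented stand-ins purely for illustration; in practice they would come from your sampler (e.g. MCMC):

S       = 1000
alpha_s = rnorm(S, mean = 1.0, sd = 0.1)        # hypothetical posterior draws of alpha
beta_s  = rnorm(S, mean = 2.0, sd = 0.1)        # hypothetical posterior draws of beta
sigma_s = abs(rnorm(S, mean = 0.5, sd = 0.05))  # hypothetical posterior draws of sigma
x_new   = 0.7
y_pred  = rnorm(S, mean = alpha_s + beta_s * x_new, sd = sigma_s)
# y_pred is a sample from the posterior predictive distribution at x_new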

My understanding is that PIT (Probability Integral Transform) values can serve as such a method: they can be used to test whether the predictive densities are correctly specified, i.e. whether a 95% forecast interval contains the future value of the series in 95% of repeated applications.

It seems that once the PIT values are obtained, and if the predictive distributions are correctly specified, the PITs should be approximately uniformly distributed on $[0, 1]$.
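
As a quick sanity check of that claim, here is a hypothetical simulation in which the predictive CDF is known analytically; PITs computed with the true model come out close to uniform:

set.seed(1)
n     = 1000
x     = runif(n)
alpha = 1; beta = 2; sigma = 0.5                       # true parameters, assumed known
y     = rnorm(n, mean = alpha + beta * x, sd = sigma)
pit   = pnorm(y, mean = alpha + beta * x, sd = sigma)  # PIT = F(y)
hist(pit, breaks = 20)                                 # roughly flat histogram
ks.test(pit, "punif")                                  # formal test of uniformity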

However, it is not clear to me how these PIT values are calculated when the predictive distribution is only available as samples, and I have yet to find a worked example of how to do so…

E.g. suppose you have the following output (observed values of the target variable alongside samples from your posterior predictive distribution; the sample size is kept small purely for illustration). How would you calculate the PIT values in this case?

Observed_Data     Predictive_Distribution_Samples 
10                [5, 6, 9, 9, 9, 10, 10, 11, 11, 14] 
20                [15, 17, 19, 19, 19, 20, 21, 21, 21, 25]
30                [26, 26, 28, 29, 29, 30, 31, 33, 34, 37]

Best Answer

If your predictive distribution for a variable $X$ has CDF $F_X$, then the PIT of a realized value $x_0$ is $p_0 = F_X(x_0)$. For example, if your predictive distribution is $N(\mu, \sigma^2)$, the R code for obtaining the PIT is p_0 = pnorm(x_0, mean = mu, sd = sigma).

If $F_X$ is not available analytically but you can sample from $X$, then you can obtain $p_0$ by calculating the empirical quantile level of $x_0$ within a large sample from $X$. In R this would be something like p_0 = rank(c(x_0, xs))[1]/(length(xs)+1), where xs is a large sample from $X$. (Not necessarily the best approximation, but it should do.)

With your data, the PITs come out to 0.636, 0.591 and 0.591:

x_0 = 10; xs = c(5, 6, 9, 9, 9, 10, 10, 11, 11, 14)      # first observation and its predictive sample
p_0 = rank(c(x_0,xs))[1]/(length(xs)+1); print(p_0)      # 0.636
x_0 = 20; xs = c(15, 17, 19, 19, 19, 20, 21, 21, 21, 25)
p_0 = rank(c(x_0,xs))[1]/(length(xs)+1); print(p_0)      # 0.591
x_0 = 30; xs = c(26, 26, 28, 29, 29, 30, 31, 33, 34, 37)
p_0 = rank(c(x_0,xs))[1]/(length(xs)+1); print(p_0)      # 0.591
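
For completeness, here is the same rank-based formula applied to all three rows at once (the data hard-coded from the table above); this is just a compact restatement of the three calls:

obs     = c(10, 20, 30)
samples = list(c(5, 6, 9, 9, 9, 10, 10, 11, 11, 14),
               c(15, 17, 19, 19, 19, 20, 21, 21, 21, 25),
               c(26, 26, 28, 29, 29, 30, 31, 33, 34, 37))
pit = mapply(function(x_0, xs) rank(c(x_0, xs))[1] / (length(xs) + 1),
             obs, samples)
print(pit)  # 0.636 0.591 0.591 (up to rounding)

With more than a handful of observations, you would then check the resulting PITs for uniformity, e.g. with hist(pit) or ks.test(pit, "punif").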