Solved – Time series forecast with probability

bayesian, probability, r, time series

I have three years of monthly historical data for a particular metric, broken down by category. The metric is a percentage and it is heavily skewed towards 1: more than 75% of values are above 0.9, but some are as low as 0.3.

My idea was to create some form of time series forecast that I can simulate thousands of times, to estimate the probability that the metric for a future month exceeds some threshold, for example 0.95.

I tried a linear model, but that doesn't work at all.

Best Answer

It seems that you are struggling to find an adequate assumption about the distribution of the response variable. Classical linear regression and classical ARMA models assume that the response variable has support on the whole real line, $(-\infty, \infty)$. Often the response is also assumed to be normally distributed. This is clearly not the case in your application, where the response is confined to the unit interval and heavily skewed.

I would first disregard the (potential) time dependence in the data and fit a Beta regression. Beta regression is a GLM-type model that assumes the response variable follows a Beta distribution, conditional on the covariates. The Beta distribution is a very flexible continuous distribution on the unit interval, $(0,1)$. This answer has some good references: Regression for an outcome (ratio or fraction) between 0 and 1.
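To illustrate the simulation idea from the question under this distributional assumption, here is a minimal sketch in Python (standard library only; the question is tagged R, so treat the language as illustrative). It is not a Beta regression: it ignores covariates and time, fits a single Beta distribution by the method of moments, and then estimates the probability of exceeding 0.95 by Monte Carlo. The data in `history` and the helper names `fit_beta_moments` and `prob_exceeds` are made up for the example.

```python
import random
import statistics


def fit_beta_moments(values):
    """Method-of-moments estimates (a, b) of a Beta distribution.

    Uses mean m and precision phi = a + b, where
    Var = m * (1 - m) / (phi + 1).
    """
    m = statistics.fmean(values)
    v = statistics.variance(values)
    if not 0.0 < m < 1.0 or v <= 0.0 or v >= m * (1.0 - m):
        raise ValueError("data are not compatible with a Beta distribution")
    phi = m * (1.0 - m) / v - 1.0
    return m * phi, (1.0 - m) * phi


def prob_exceeds(a, b, threshold, n_sims=20_000, seed=0):
    """Monte Carlo estimate of P(Y > threshold) for Y ~ Beta(a, b)."""
    rng = random.Random(seed)
    hits = sum(rng.betavariate(a, b) > threshold for _ in range(n_sims))
    return hits / n_sims


# Hypothetical monthly values for one category (assumption, not real data).
history = [0.97, 0.95, 0.99, 0.92, 0.98, 0.96,
           0.60, 0.94, 0.99, 0.91, 0.97, 0.93]

a, b = fit_beta_moments(history)
p = prob_exceeds(a, b, threshold=0.95)
print(f"a={a:.2f}, b={b:.2f}, estimated P(Y > 0.95) = {p:.3f}")
```

In a real Beta regression the mean (and possibly the precision) would depend on covariates such as category or month, and the simulation would draw from the fitted conditional distribution instead of a single pooled one.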

If you find that there is significant serial correlation in your response variable that the covariates cannot account for, I would look into the Beta-ARMA models of Rocha & Cribari-Neto (2009) or Guolo & Varin (2014). Guolo & Varin (2014) is probably the easier one to get started with, since they include a nice example in R where they fit a Beta-ARMA model to illness percentages over time.
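As a rough first screen for serial correlation, one could look at the lag-1 autocorrelation of the series on the logit scale and compare it with the usual approximate $\pm 1.96/\sqrt{n}$ band for white noise. The sketch below (Python, standard library only) does this on made-up data; the names `logit`, `lag1_autocorr`, and `series` are illustrative. Strictly, the check belongs on the residuals of the fitted Beta regression, so this is only a crude proxy.

```python
import math
import statistics


def logit(p):
    """Map a value in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    m = statistics.fmean(xs)
    dev = [x - m for x in xs]
    num = sum(d0 * d1 for d0, d1 in zip(dev, dev[1:]))
    den = sum(d * d for d in dev)
    return num / den


# Hypothetical monthly values, strictly inside (0, 1) (assumption).
series = [0.97, 0.95, 0.99, 0.92, 0.98, 0.96,
          0.60, 0.94, 0.99, 0.91, 0.97, 0.93]

z = [logit(y) for y in series]
r1 = lag1_autocorr(z)
bound = 1.96 / math.sqrt(len(z))  # approximate 95% band under white noise
print(f"lag-1 autocorrelation = {r1:.3f}, band = +/-{bound:.3f}")
print("serial correlation suspected" if abs(r1) > bound else "no strong evidence")
```

With only 36 monthly observations per category the band is wide, so a value inside it does not rule out dependence.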
