I'm dealing with sales forecast for products in same category (f.i. all products belonging to $wine$ category). Data are time series with a daily frequency, $Y_{product_i} \in {0,1,2,..}$ . After some inspection I found out that time series models like Arima or Arimax do not fit particularly well my data and so I would like to try a new way.
I was considering, having count data, a Poisson Regression Model, which for a single product would look like this:
$Y_t|F_{t-1} \sim Poisson(\lambda_{t})$
$\lambda_t = \beta_0 + \beta_1*Y_{t-1} + \alpha_1*\lambda_{t-1} + \eta*X_t$
where $Y_t $ = sales at day t, $X_t$ = Exogenous regressors at day t.
My first question is:
Is there any package in $R$ implementing this? Is $glm$ sufficient?
Furthermore some products have very sparse time series while others sell always at least a couple of decades of items: would it be proper to consider log linear models in the second case and Hurdle models in the first one?
I was then considering the fact that I probably will need a more complex model which models a covariance structure between the sales of different products.
Unfortunately I don't have much theoretical and Rpackages background in this case. I'm thinking about mixed effects models but I'm just wondering.. Basically I would like a Poisson Regression model with an autoregressive term for each of my products, but I don't know how to insert a covariance structure over the products. Should I go Bayesian?
$Y_{i_t} \sim Poisson(\lambda_{i_t})$
$\lambda_{i_t} = \beta_{0_i} + \beta_{i_1}*Y_{i_{t-1}} +\alpha_{1_i}*\lambda_{i_{t-1}} + \eta_i*X_{i_t}$
$i=1,..,numberofproducts$
So my second question is about finding theoretical perspective about multivariate time series count data together with packages in R.
Best Answer
I found a reference which answered my question: https://arxiv.org/pdf/1405.3738.pdf.
The model is quite complicated, here is the state space representation:
So, let's say I have L different products I'm studying across 1,..,T time periods.
$Y_{l,t} \sim z*\delta_0 + (1-z)NB(exp(\widetilde{\eta}_{l,t}),alpha_l)$ is the distribution for product l at time t
$\widetilde{\eta}_{l,t} = \eta_{l,t} + X_{l,t}\theta_l$ this is the Log of the mean of product l sales at time t, guaranteeing that it is positive.
$\eta_{l,t} = \mu_l + \phi_l(\eta_{l,t-1}-\mu) + \epsilon_{l,t}$
$\epsilon_{l,t} \sim N(0,\frac{1}{\tau_l})$
The other priors and hyperpriors are in the next images:
P.S. Now I'm trying to write the JAGS code and any help would be much appreciated! ( https://stackoverflow.com/questions/40528715/runtime-error-in-jags )
Edit:
Here is the JAGS code:
Which I call from R using runjags: