Over rolling windows you can update your posteriors by re-estimating the model every 50 observations (or however often you want). This is easily accomplished in a for loop.
Posterior samples, forecasts, and other quantities are then regenerated with each re-estimation of the model. At the end of the day it is just a lot of loops. You may also want to consider running the models in parallel (e.g. with the foreach and doParallel packages) to speed up computation if run time is an issue for you.
Unfortunately, I do not know of any R packages that provide functions for automatically computing rolling-window forecasts in a Bayesian setting. For the most part I think you have to program it yourself. It is a little tedious figuring out how to index and store everything, but the good news is that it is not that hard to do.
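To make the looping concrete, here is a minimal sketch of what I mean (all names are mine, not from any package; `refit_model` is a placeholder for whatever routine runs your own sampler and returns posterior draws):

```r
# Sketch of a rolling-window re-estimation loop (hypothetical helper names).
# refit_model() stands in for your own estimation routine (e.g. a Gibbs
# sampler); here it just records the window it was given so the loop runs.
refit_model <- function(y) list(n_obs = length(y), theta = mean(y))

r      <- rnorm(500)   # toy return series
step   <- 50           # re-estimate every 50 observations
window <- 250          # length of each rolling window
starts <- seq(1, length(r) - window, by = step)

fits <- vector("list", length(starts))
for (i in seq_along(starts)) {
  idx       <- starts[i]:(starts[i] + window - 1)
  fits[[i]] <- refit_model(r[idx])  # store posterior draws / forecasts here
}
length(fits)  # one fitted model per window
```

Each element of `fits` then holds the posterior output for one window; the forecasting step described below would sit inside the same loop.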
Though I do not think you need it, I give more intuition below (which does not differ much from the intuition I gave in the previous post you referenced).
Assume your estimation sample consists of $r_1,r_2,r_3,...,r_t$ and that you want to generate forecasts for the out-of-sample window $r_{t+1},r_{t+2},...,r_{t+K}$.
In a Bayesian setting the future returns are treated as random variables with their own joint posterior distribution. So instead of producing point forecasts, you draw samples from the forecast distribution:
$$P(r_{t+1},r_{t+2},...,r_{t+K}|r_1,r_2,...,r_t,\boldsymbol{\theta}_t)$$ where $\boldsymbol{\theta}_t$ is just my shorthand notation for a vector containing the model parameters.
You can sample from the above forecast distribution within a Gibbs sampler. If your posterior is made up of $G$ draws, then for each $g=1,2,3,...,G$ you can draw
$r^{(g)}_{t+1} \sim P(r_{t+1}|r_1,...,r_t,\boldsymbol{\theta}^{(g)}_t)$
then
$r^{(g)}_{t+2}\sim P(r_{t+2}|r_1,...,r_t,r^{(g)}_{t+1},\boldsymbol{\theta}^{(g)}_t)$
and keep going until you sample
$r^{(g)}_{t+K} \sim P(r_{t+K}|r_1,...,r_t,r^{(g)}_{t+1},...,r^{(g)}_{t+K-1},\boldsymbol{\theta}^{(g)}_t)$
For any $k=1,2,...,K$, the conditional posterior $P(r_{t+k}|r_1,...,r_t,r^{(g)}_{t+1},...,r^{(g)}_{t+k-1},\boldsymbol{\theta}^{(g)}_t)$ can be sampled from in the following manner:
$$
\omega_{t+k}^{(g)} \sim IG\bigg(\frac{v^{(g)}}{2},\frac{v^{(g)}}{2}\bigg)\;\;\;\;\;\varepsilon_{t+k}^{(g)} \sim N(0,1)
$$
$$
r_{t+k}^{(g)} = \varepsilon_{t+k}^{(g)}\times \bigg(\frac{v^{(g)}-2}{v^{(g)}} \omega_{t+k}^{(g)}h_{t+k}^{(g)}\bigg)^{1/2}
$$
$h^{(g)}_{t+k+1}$, which is needed to sample $r^{(g)}_{t+k+1}$, can be calculated deterministically once $r^{(g)}_{t+k}$ is given as shown below
$$
h_{t+k+1}^{(g)} = \alpha_0^{(g)} + \alpha_1^{(g)}\big(r_{t+k}^{(g)}\big)^2 + \beta^{(g)}h_{t+k}^{(g)}
$$
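The sampling scheme above translates almost line for line into R. Below is one possible implementation (the function and argument names are mine): given $G$ posterior draws of $(\alpha_0,\alpha_1,\beta,v)$ and of $h_{t+1}$, it simulates a $K$-step-ahead return path for each draw. The inverse-gamma draw uses the fact that $\omega \sim IG(a,b)$ is equivalent to $1/\omega \sim \text{Gamma}(a, \text{rate}=b)$.

```r
# One possible implementation of the forecast sampler above (names are mine,
# not from any package): for each posterior draw g, iterate the k = 1,...,K
# steps of the composition sampler described in the text.
sample_forecast_paths <- function(post, h_next, K) {
  G <- length(post$v)
  paths <- matrix(NA_real_, nrow = G, ncol = K)
  for (g in 1:G) {
    h <- h_next[g]  # h_{t+1}^{(g)} carried over from the Gibbs output
    for (k in 1:K) {
      # omega ~ IG(v/2, v/2) via the reciprocal of a Gamma draw
      w   <- 1 / rgamma(1, shape = post$v[g] / 2, rate = post$v[g] / 2)
      eps <- rnorm(1)
      paths[g, k] <- eps * sqrt((post$v[g] - 2) / post$v[g] * w * h)
      # standard GARCH(1,1) recursion with the squared simulated return
      h <- post$alpha0[g] + post$alpha1[g] * paths[g, k]^2 + post$beta[g] * h
    }
  }
  paths
}

# toy posterior draws just to illustrate the call
set.seed(1)
G    <- 200
post <- list(alpha0 = rep(0.05, G), alpha1 = rep(0.10, G),
             beta   = rep(0.85, G), v      = rep(8, G))
fc   <- sample_forecast_paths(post, h_next = rep(1, G), K = 10)
dim(fc)  # G x K matrix of draws from the joint forecast distribution
```

Column $k$ of the resulting matrix is a sample from the marginal forecast distribution of $r_{t+k}$, so quantiles of each column give you credible forecast intervals.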
Also, I see that you simply fixed $h_0=0$. I do not know whether that is a good assumption, but off the top of my head I do not know a better way of going about it.
Best Answer
I will attempt answering question 2. (Question 1 remains open.)
I checked an article about the "fGarch" package: Wurtz et al. "Parameter Estimation of ARMA Models with GARCH/APARCH Errors: An R and SPlus Software Implementation".
It seems that the initial value of the conditional variance, $\sigma^2_0$, is taken to be the unconditional variance of the sample (with $n$ rather than $n-1$ in the denominator) -- see p. 6 in the article cited above, particularly line 21:
`h = filter(e, beta, "r", init = Mean)`

Here I suppose `init` stands for the initial value of the conditional variance, and `Mean` is the mean of the squared errors as defined in line 18. I am not completely sure, though.

Similarly, the initial value of the squared error, $\varepsilon_0^2$, seems to be the same as $\sigma^2_0$, i.e. `Mean`, as can be seen in line 20:

`e = omega + alpha * c(Mean, z[-length(x)]^2)`

Interestingly, at the bottom of p. 5 the article discusses the initial value selection for all the other parameters ($\mu = \bar x$, $\omega = 0.1 \cdot \widehat{\text{Var}}(x)$, $\alpha = 0.1$, $\beta = 0.8$) but not for $\sigma^2_0$. I wonder why they did not make it explicit for $\sigma^2_0$. Perhaps I am missing something.