Over rolling windows you can update your posteriors by re-estimating the model every 50 observations (or however often you want to do it). This can easily be accomplished in a for loop.
Posterior samples, forecasts, and other quantities can be re-generated with each re-estimation of the model. At the end of the day it is just a lot of loops. You may also want to consider running the models in parallel (e.g. with the foreach and doParallel libraries) to speed up computation time if that is an issue for you.
Unfortunately, I do not know of any R packages that provide functions for automatically computing rolling window forecasts in a Bayesian setting. For the most part I think you have to program it yourself. It is a little tedious figuring out how to index and store everything, but the good news is that it is not that hard to do.
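To make the loop structure concrete, here is a minimal sketch of the rolling re-estimation described above, written in Python for illustration (the same indexing logic carries over directly to an R for loop). The window width, step size, horizon, and the stand-in `fit_model` function are all hypothetical; in practice `fit_model` would be your own Gibbs sampler rather than the conjugate normal-model shortcut used here to keep the sketch runnable.

```python
import numpy as np

rng = np.random.default_rng(0)
returns = rng.standard_normal(500)  # placeholder return series

def fit_model(window):
    """Stand-in for a full Bayesian estimation step (e.g. a Gibbs sampler).
    Here it just draws G posterior samples of the mean and variance of a
    normal model under a noninformative prior, so the skeleton runs."""
    G, n = 1000, len(window)
    s2 = window.var(ddof=1)
    # sigma^2 | data ~ scaled inverse chi-square; mu | sigma^2, data ~ normal
    sigma2 = (n - 1) * s2 / rng.chisquare(n - 1, size=G)
    mu = rng.normal(window.mean(), np.sqrt(sigma2 / n))
    return mu, sigma2

step, width, horizon = 50, 250, 5
forecasts = {}  # forecast draws keyed by the window's last in-sample index
for end in range(width, len(returns) - horizon + 1, step):
    window = returns[end - width:end]
    mu, sigma2 = fit_model(window)  # re-estimate every `step` observations
    # one K-step forecast path per posterior draw
    draws = rng.normal(mu[:, None], np.sqrt(sigma2)[:, None],
                       size=(len(mu), horizon))
    forecasts[end] = draws

print(len(forecasts), next(iter(forecasts.values())).shape)
```

Storing the draws in a dictionary keyed by the window endpoint is one convenient way to handle the indexing headache mentioned above; a three-dimensional array (window x draw x horizon) works just as well.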
Though I do not think you need it, I give more intuition below (which does not differ much from the intuition I gave in the previous post you referenced).
Assume your estimation sample consists of $r_1,r_2,r_3,...,r_t$ and that you want to generate forecasts for an out-of-sample window $r_{t+1},r_{t+2},...,r_{t+K}$.
In a Bayesian setting the forecasts are treated as random variables with their own joint posterior distribution. So instead of producing point forecasts, you draw a sample from the forecast distribution:
$$P(r_{t+1},r_{t+2},...,r_{t+K}|r_1,r_2,...,r_t,\boldsymbol{\theta}_t)$$ where $\boldsymbol{\theta}_t$ is just my shorthand notation for a vector containing the model parameters.
You can sample from the above forecast distribution with a Gibbs sampler. If your posterior is made up of $G$ samples, then for each $g=1,2,3,...,G$ you can draw
$r^{(g)}_{t+1} \sim P(r_{t+1}|r_1,...,r_t,\boldsymbol{\theta}^{(g)}_t)$
then
$r^{(g)}_{t+2}\sim P(r_{t+2}|r_1,...,r_t,r^{(g)}_{t+1},\boldsymbol{\theta}^{(g)}_t)$
and keep going until you sample
$r^{(g)}_{t+K} \sim P(r_{t+K}|r_1,...,r_t,r^{(g)}_{t+1},...,r^{(g)}_{t+K-1},\boldsymbol{\theta}^{(g)}_t)$
For any $k=1,2,...,K$, the conditional posterior $P(r_{t+k}|r_1,...,r_t,r^{(g)}_{t+1},...,r^{(g)}_{t+k-1},\boldsymbol{\theta}^{(g)}_t)$ can be sampled from in the following manner:
$$
\omega_{t+k}^{(g)} \sim IG\bigg(\frac{v^{(g)}}{2},\frac{v^{(g)}}{2}\bigg)\;\;\;\;\;\varepsilon_{t+k}^{(g)} \sim N(0,1)
$$
$$
r_{t+k}^{(g)} = \varepsilon_{t+k}^{(g)}\times \bigg(\frac{v^{(g)}-2}{v^{(g)}} \omega_{t+k}^{(g)}h_{t+k}^{(g)}\bigg)^{1/2}
$$
The value $h^{(g)}_{t+k+1}$, which is needed to sample $r^{(g)}_{t+k+1}$, can be calculated deterministically once $r^{(g)}_{t+k}$ is given, as shown below:
$$
h_{t+k+1}^{(g)} = \alpha_0^{(g)} + \alpha_1^{(g)}\big(r_{t+k}^{(g)}\big)^2 + \beta^{(g)}h_{t+k}^{(g)}
$$
Also, I see that you simply fixed $h_0=0$. I do not know whether that is a good assumption, but I cannot think of a better way of going about it off the top of my head.
Best Answer
A GARCH(1,1) model is
$$
\begin{aligned}
y_t &= \mu_t + u_t, \\
\mu_t &= \dots \text{(e.g. a constant or an ARMA equation without the term $u_t$)}, \\
u_t &= \sigma_t \varepsilon_t, \\
\sigma_t^2 &= \omega + \alpha_1 u_{t-1}^2 + \beta_1 \sigma_{t-1}^2, \\
\varepsilon_t &\sim i.i.d.(0,1).
\end{aligned}
$$
The three components in the conditional variance equation you refer to are $\omega$, $u_{t-1}^2$, and $\sigma_{t-1}^2$. Your question seems to be, how is $\omega$ different from $\sigma_{t-1}^2$?
First, note that $\omega$ is not the long-run variance; the latter actually is $\sigma_{LR}^2:=\frac{\omega}{1-(\alpha_1+\beta_1)}$. $\omega$ is an offset term, the lowest value the variance can achieve in any time period, and is related to the long-run variance as $\omega=\sigma_{LR}^2(1-(\alpha_1+\beta_1))$.
Second, $\sigma_{t-1}^2$ is not the historical variance over a moving window; it is the conditional (instantaneous) variance at time $t-1$.
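A short simulation makes the three quantities concrete. The parameter values below are hypothetical; the code (Python for illustration) simulates a GARCH(1,1) path and checks that the sample variance of $u_t$ hovers around $\sigma_{LR}^2=\omega/(1-(\alpha_1+\beta_1))$ while $\omega$ acts as a floor on the conditional variance, and $\sigma_t^2$ itself wanders between the two.

```python
import numpy as np

rng = np.random.default_rng(2)
omega, alpha1, beta1 = 0.05, 0.08, 0.90      # hypothetical parameters
sigma_lr2 = omega / (1 - (alpha1 + beta1))   # long-run variance

T = 200_000
u = np.empty(T)
sigma2 = np.empty(T)
sigma2[0] = sigma_lr2                        # start at the long-run level
u[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
for t in range(1, T):
    # conditional variance recursion, then the return innovation
    sigma2[t] = omega + alpha1 * u[t - 1]**2 + beta1 * sigma2[t - 1]
    u[t] = np.sqrt(sigma2[t]) * rng.standard_normal()

print(round(sigma_lr2, 3), round(u.var(), 3), round(sigma2.min(), 3))
```

With nonnegative parameters, $\sigma_t^2 \ge \omega$ in every period, which is exactly the "offset" role described above.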