But is the one-step-ahead predictor not already defined as the value $\hat\sigma$ of the volatility that minimizes the MSE?
If you estimate the GARCH model by maximum likelihood, then the fitted values $\hat\sigma_t$ are the likelihood-maximizing values (subject to the GARCH(1,1) functional form), which need not coincide with the MSE-minimizing values. Whether they do depends on the distribution assumed in the likelihood.
Also, when fitting the model on a data sample indexed from $1$ to $T$, the fitted value $\hat\sigma^2_t$ for $t<T$ uses information not only from $1\leqslant\tau<t$ but also from $t\leqslant\tau\leqslant T$, because the parameter estimates are based on the full sample. Meanwhile, to evaluate forecast performance fairly, you should not allow the estimate to use future data; instead, you need to check forecast accuracy out of sample.
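To make the information-set point concrete, here is a minimal numpy sketch of the GARCH(1,1) one-step-ahead variance recursion. The parameter values (`omega`, `alpha`, `beta`) are hypothetical rather than estimated, and the data are simulated; the point is that the forecast for $t+1$ is built only from observations up to $t$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical GARCH(1,1) parameters (illustrative, not estimated from data).
omega, alpha, beta = 0.05, 0.08, 0.90

# Simulate T daily returns from the model.
T = 1000
r = np.empty(T)
sigma2 = np.empty(T)
sigma2[0] = omega / (1.0 - alpha - beta)  # unconditional variance
r[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
for t in range(1, T):
    # The conditional variance for day t uses only information up to t-1.
    sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
    r[t] = np.sqrt(sigma2[t]) * rng.standard_normal()

# A genuine one-step-ahead forecast made at time T-1 for time T:
# it never touches any return beyond the last observed one.
forecast_next = omega + alpha * r[-1] ** 2 + beta * sigma2[-1]
print(forecast_next)
```

An honest out-of-sample evaluation preserves exactly this structure: both the parameters and the conditioning information used for the forecast at $t+1$ come from data up to $t$.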
So why compute this measure if it is going to be the minimum across models anyway?
Because you may want to know what the actual value of the MSE is. Knowing that the MSE is minimal does not tell you what its value is. (See also the answer to the previous question.)
Moreover, do I understand correctly that the practical way to compute the $MSE$...
That will be the mean squared forecast error, computed out of sample. If you want the in-sample MSE, just use the fitted values from the model estimated on the whole sample. The former should give an unbiased estimate of model performance, while the latter will be too optimistic in that respect.
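As a rough illustration of the distinction, the sketch below uses simulated data and a deliberately simple variance "model" (a rolling mean of squared returns, not a GARCH fit). The in-sample fitted values use a centered window, so they peek at future observations; the out-of-sample forecasts use only past data. The out-of-sample figure will typically be the larger of the two:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Simulated returns with slowly varying volatility (illustrative setup).
T = 4000
sigma = 0.01 * (1.0 + 0.5 * np.sin(np.arange(T) / 40.0))
r2 = pd.Series((sigma * rng.standard_normal(T)) ** 2)  # squared returns

window = 63
# "In-sample" fitted variance: centered window, so each fitted value
# uses observations from both before and after t.
fit_insample = r2.rolling(window, center=True).mean()
# Out-of-sample one-step forecast: trailing window lagged one day,
# so only past observations enter each forecast.
fc_oos = r2.rolling(window).mean().shift(1)

mask = fit_insample.notna() & fc_oos.notna()
mse_insample = ((r2 - fit_insample)[mask] ** 2).mean()
msfe_oos = ((r2 - fc_oos)[mask] ** 2).mean()
print(mse_insample, msfe_oos)
```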
Also, I understand from the Andersen and Bollerslev (1998) paper that using the squared daily return to approximate realized volatility introduces noise.
Very important point indeed! The response variable of the GARCH model is measured with noise when squared errors are used as proxies; this noise may be quite substantial. When trying to assess model fit, the measurement error associated with the dependent variable may cause quite some trouble.
For example, a better estimate of the realized daily variance would be the sum of squared 30-minute returns within that day.
On first thought, that could be a valid option. If that is what was proposed in the Andersen and Bollerslev (1998) paper, then it should be fine.
So to get a "better" MSE, I could substitute for every $\sigma^2_i$ the sum of squared 30-minute returns of that day instead of simply the daily squared return? Is this how you use realized volatility to evaluate the goodness of your forecasts?
That makes sense to me.
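The realized-variance idea above can be checked with a small simulation. Under the simplifying assumption of i.i.d. normal 30-minute returns (13 bars per trading day here), both the squared daily return and the sum of squared 30-minute returns are unbiased for the daily variance, but the intraday proxy is far less noisy:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate many days of intraday data: 13 thirty-minute returns per day,
# each N(0, sigma^2 / 13), so the daily return has variance sigma^2.
n_days, bars_per_day = 2000, 13
sigma_daily = 0.01
intraday = (rng.standard_normal((n_days, bars_per_day))
            * sigma_daily / np.sqrt(bars_per_day))

daily_return = intraday.sum(axis=1)
proxy_r2 = daily_return ** 2            # squared daily return
proxy_rv = (intraday ** 2).sum(axis=1)  # realized variance from 30-min bars

true_var = sigma_daily ** 2
# Both proxies center on the true daily variance...
print(proxy_r2.mean(), proxy_rv.mean(), true_var)
# ...but the squared daily return is a much noisier estimator.
print(proxy_r2.var(), proxy_rv.var())
```

In theory the variance of the squared daily return is $2\sigma^4$, while the realized-variance proxy has variance $2\sigma^4/13$ in this setup, which is what makes it the preferable evaluation target.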
I will end this rambling by asking for a good reference on evaluating the accuracy of forecasts using realized volatility...
Regarding references, I think the Andersen and Bollerslev (1998) paper is quite relevant and complete. Also see Patton & Sheppard, "Evaluating volatility and correlation forecasts" (2009), and other works by Patton.
Here is an example notebook:
# In[1]:
import pandas as pd
import numpy as np
import pandas_datareader as pdr
import datetime
import arch
# In[2]:
start = datetime.datetime(1995, 1, 1)
df = pdr.data.DataReader('SPY', 'yahoo', start=start)
df = df[['Close']]  # keep only the closing price
# In[3]:
df['log_price'] = np.log(df['Close'])
df['pct_change'] = df['log_price'].diff()  # daily log returns
df.head()
# In[4]:
df['stdev21'] = df['pct_change'].rolling(window=21, center=False).std()
df['hvol21'] = df['stdev21'] * (252**0.5) # Annualize.
#df['variance'] = df['hvol21']**2
df = df.dropna() # Remove rows with blank cells.
# In[5]:
df.head()
# In[6]:
returns = df['pct_change'] * 100  # scale to percent; helps the optimizer converge
am = arch.arch_model(returns)
# In[7]:
res = am.fit(disp='off')
res.summary()
# In[8]:
df['forecast_vol'] = 0.01 * np.sqrt(
    res.params['omega']
    + res.params['alpha[1]'] * res.resid**2
    + res.params['beta[1]'] * res.conditional_volatility**2
)  # one-step-ahead conditional volatility; the 0.01 undoes the earlier *100 scaling
# In[9]:
df.tail()
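To close the loop between the forecasts and the realized-volatility discussion above, here is a standalone numpy sketch (simulated data and hypothetical GARCH(1,1) parameters, not the notebook's estimates) showing how the choice of proxy changes the MSE you measure: the same one-step forecasts score very differently against squared daily returns versus an intraday realized-variance proxy, because the squared-return proxy's noise inflates the loss.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical GARCH(1,1) parameters (illustrative only).
omega, alpha, beta = 0.05, 0.08, 0.90
T, bars = 2000, 13

sigma2 = np.empty(T)  # conditional variances = one-step-ahead forecasts
r = np.empty(T)       # daily returns
rv = np.empty(T)      # realized variance from 13 intraday bars
sigma2[0] = omega / (1.0 - alpha - beta)
for t in range(T):
    if t > 0:
        sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
    # Intraday bars whose squares sum to the realized-variance proxy.
    bars_t = rng.standard_normal(bars) * np.sqrt(sigma2[t] / bars)
    r[t] = bars_t.sum()
    rv[t] = (bars_t ** 2).sum()

forecast = sigma2  # one-step-ahead conditional variances (true parameters)
mse_vs_r2 = np.mean((r ** 2 - forecast) ** 2)
mse_vs_rv = np.mean((rv - forecast) ** 2)
print(mse_vs_r2, mse_vs_rv)
```

Even though the forecasts here are the true conditional variances, the MSE against squared daily returns is much larger than against the realized-variance proxy; that gap is pure measurement noise in the evaluation target, not forecast error.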
Best Answer
I understand that you want to evaluate volatility forecasts by comparing the forecasted standard deviation of the model error with the realized absolute value of the model error. This can be done by comparing the forecast of $\sigma$ from the GARCH output with the absolute difference between the point forecast and the realized value. But the standard deviation is not equal to the expected absolute value (under normality, $E|x| = \sigma\sqrt{2/\pi} \approx 0.8\,\sigma$), so this may not be a good way of evaluating your GARCH forecasts. Alternatives could be,