When building a time series model, is there a difference, both from a theoretical perspective and a practical performance perspective, between training a one-step-ahead prediction model and forecasting one step at a time into the future for N steps versus directly training an N-step-ahead model?
If the purpose is to forecast N steps into the future, would an N-step ahead model have any performance advantages?
Best Answer
If the model is correctly specified, then the optimal forecast is the iterated forecast (i.e. you forecast each intermediate $y_{T+k}$, $k = 1, \dots, h-1$, to finally produce $\hat y_{T+h}$). The direct forecast (where you estimate the model with $y_t$ as a function of $y_{t-h}$, so that the 'one'-step-ahead forecast is now an $h$-step-ahead forecast in 'physical' time) is less efficient in this case, but on the upside it is more robust to model misspecification.
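To make the two approaches concrete, here is a minimal sketch, assuming a simulated AR(1) series and plain OLS with an intercept. The function names (`simulate_ar1`, `fit_lagged_ols`, `iterated_forecast`, `direct_forecast`) and the parameter values are illustrative assumptions, not taken from the paper cited below.

```python
import numpy as np

def simulate_ar1(n, phi=0.8, c=0.0, sigma=1.0, seed=0):
    """Simulate y_t = c + phi * y_{t-1} + e_t (illustrative parameters)."""
    rng = np.random.default_rng(seed)
    y = np.zeros(n)
    for t in range(1, n):
        y[t] = c + phi * y[t - 1] + sigma * rng.standard_normal()
    return y

def fit_lagged_ols(y, lag):
    """OLS of y_t on a constant and y_{t-lag}; returns (intercept, slope)."""
    X = np.column_stack([np.ones(len(y) - lag), y[:-lag]])
    coef, *_ = np.linalg.lstsq(X, y[lag:], rcond=None)
    return coef[0], coef[1]

def iterated_forecast(y, h):
    """Fit the one-step model once, then feed each forecast back in h times."""
    c, phi = fit_lagged_ols(y, 1)
    f = y[-1]
    for _ in range(h):
        f = c + phi * f
    return f

def direct_forecast(y, h):
    """Fit y_t on y_{t-h} and forecast y_{T+h} in a single step."""
    c, beta = fit_lagged_ols(y, h)
    return c + beta * y[-1]

y = simulate_ar1(500)
h = 5
print("iterated:", iterated_forecast(y, h))
print("direct:  ", direct_forecast(y, h))
```

For $h = 1$ the two forecasts coincide by construction. For $h > 1$ the iterated forecast reuses the single estimated one-step slope $h$ times, while the direct forecast estimates a separate horizon-specific slope, which is where the efficiency versus robustness trade-off comes from.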
Marcellino, Stock and Watson investigated this (in the AR context) in more detail.
A free version of their paper is available here: https://www.princeton.edu/~mwatson/papers/hstep_3.pdf
Massimiliano Marcellino, James H. Stock, Mark W. Watson (2006) "A comparison of direct and iterated multistep AR methods for forecasting macroeconomic time series", Journal of Econometrics, 135(1–2), 499–526, https://doi.org/10.1016/j.jeconom.2005.07.020.