I built a model for directional prediction of a time series. At each step, I predict the next direction of the series (up or down). Currently, the predicted outputs lag the real outputs.
For example, in the outputs above, the first figure shows the real direction and the second figure shows the predicted direction (red star: next up trend, blue star: next down trend). We can see that the predictions have a 1-step lag. Overall, my results would be better without this lag. I saw this link, which mentions that this problem is related to the "naive predictor". Do we have the same behavior in this problem (inputs: different lags of the time series; output: 1 or 0)?
How can I resolve that? Currently, I'm using logistic regression in this model.
- I checked my input data using a unit root test. The input data was non-stationary, so I transformed it to stationary using different methods (differencing, detrending, etc.), but I still have the same problem. Is this the same problem mentioned HERE?
Best Answer
When you use historical data to predict future data, you will frequently find that the predictions appear to lag the realizations. Here is why: many macroeconomic and financial time series behave similarly to random walks (maybe with a drift, maybe with time-varying variance, but ultimately dominated by the random walk nature).
If the data generating process is a random walk, $$ x_t = x_{t-1} + \varepsilon_t $$ with $\varepsilon_t \sim \text{i.i.d.}(0,\sigma^2)$, the optimal* one-step-ahead prediction is $$ \hat x_{t+1|t}:=\mathbb{E}(x_{t+1}|x_{t},x_{t-1},\dots)=x_{t} $$ which happens to be the realization $x_{t+1}$ lagged by one ($x_t$ is $x_{t+1}$ lagged by 1). Given the data generating process, this apparently lagging prediction is the best we can get*. So there is no way we can "fix" the "problem" of lagging predictions; the problem is built in due to the nature of the data generating process.
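A small simulation can make this concrete. The sketch below is only an illustration under assumed settings that are not from the question: Gaussian innovations with $\sigma = 1$, an arbitrary seed, and two hypothetical competitor forecasts (`extrap`, which extrapolates the last change, and `ma3`, a 3-point moving average). It compares them with the naive forecast $\hat x_{t+1|t} = x_t$ under square loss, and also checks how often the sign of the last change matches the sign of the next one.

```python
import numpy as np

rng = np.random.default_rng(0)           # arbitrary seed (assumption)
n = 20_000
eps = rng.normal(0.0, 1.0, size=n)       # i.i.d. innovations, sigma = 1 (assumption)
x = np.cumsum(eps)                       # random walk: x_t = x_{t-1} + eps_t

target = x[3:]                           # realizations x_{t+1}
naive = x[2:-1]                          # forecast x_t (the "lagging" prediction)
extrap = 2 * x[2:-1] - x[1:-2]           # hypothetical rival: x_t + (x_t - x_{t-1})
ma3 = (x[2:-1] + x[1:-2] + x[:-3]) / 3   # hypothetical rival: mean of the last 3 values


def mse(forecast):
    """Mean squared one-step-ahead forecast error."""
    return float(np.mean((target - forecast) ** 2))


mse_naive, mse_extrap, mse_ma3 = mse(naive), mse(extrap), mse(ma3)

# Directional check: does the sign of the last change predict the next one?
dx = np.diff(x)
hit_rate = float(np.mean((dx[1:] > 0) == (dx[:-1] > 0)))

print(f"MSE naive:  {mse_naive:.3f}")
print(f"MSE extrap: {mse_extrap:.3f}")
print(f"MSE ma3:    {mse_ma3:.3f}")
print(f"directional hit rate of 'repeat last direction': {hit_rate:.3f}")
```

In this setup the naive forecast should have the smallest error (close to $\sigma^2$), and the directional hit rate should hover around one half, because past increments carry no information about the next one.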
As for whether your problem shows the same behavior: broadly, yes, although I am not an expert on real output (some more subject-matter knowledge could be helpful to determine how well the process can be approximated by a random walk).
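One informal way to gauge how close a series is to a random walk is to look at the slope of a regression of $x_t$ on $x_{t-1}$ and at the lag-1 autocorrelation of the differences. The sketch below is a rough diagnostic, not a formal unit root test; the helper name `random_walk_diagnostics`, the simulated series, and the AR coefficient of 0.5 are all assumptions made for illustration.

```python
import numpy as np


def random_walk_diagnostics(x):
    """Return (phi, rho1): the OLS slope of x_t on x_{t-1} and the
    lag-1 autocorrelation of the first differences.
    For a random walk, phi is close to 1 and rho1 is close to 0."""
    x = np.asarray(x, dtype=float)
    # np.polyfit with degree 1 returns [slope, intercept]
    phi, _intercept = np.polyfit(x[:-1], x[1:], 1)
    d = np.diff(x)
    d = d - d.mean()
    rho1 = float(np.dot(d[1:], d[:-1]) / np.dot(d, d))
    return float(phi), rho1


rng = np.random.default_rng(1)           # arbitrary seed (assumption)
n = 5_000

# Simulated random walk
rw = np.cumsum(rng.normal(size=n))
phi_rw, rho_rw = random_walk_diagnostics(rw)

# Simulated stationary AR(1) with coefficient 0.5, for contrast
shocks = rng.normal(size=n)
ar = np.zeros(n)
for t in range(1, n):
    ar[t] = 0.5 * ar[t - 1] + shocks[t]
phi_ar, rho_ar = random_walk_diagnostics(ar)

print(f"random walk: phi = {phi_rw:.3f}, rho1 of differences = {rho_rw:.3f}")
print(f"AR(1), 0.5:  phi = {phi_ar:.3f}, rho1 of differences = {rho_ar:.3f}")
```

If your own series gives a slope very near 1 and differences with almost no autocorrelation, the lagged inputs have little to offer the classifier, which is consistent with the lag you observe.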
* Under square loss and/or directional 0-1 loss.