Solved – Lag between predicted output and real output in time series prediction (directional prediction)

lags, logistic, random walk, regression, time series

I built a model for directional prediction of a time series. At every step, I predict the next direction of the series (up or down). Currently, the predicted outputs lag the real outputs.

For example, in the outputs above, the first figure shows the real direction and the second figure shows the predicted direction (red star: next move is up; blue star: next move is down). So we can see that the prediction outputs have a 1-step lag. Overall, I would get better results without this lag. I saw this link, which says that this is a problem related to the "naive predictor". Does the same behavior apply to this problem (inputs: different lags of the time series; output: 1 or 0)?
How can I resolve that? Currently, I'm using logistic regression in this model.

I checked my input data with a unit root test. The input data were non-stationary, so I transformed them to be stationary using different methods (differencing, detrending, etc.), but I still have the same problem. Is this the same problem mentioned HERE?

Best Answer

So we can see that in prediction outputs we have 1-step lag. <...> How can I resolve that?

When you use historical data to predict future data, you will frequently find that the predictions appear to lag the realizations. Here is why: many macroeconomic and financial time series behave similarly to random walks (maybe with a drift, maybe with time-varying variance, but ultimately dominated by the random walk nature).

If the data generating process is a random walk, $$ x_t = x_{t-1} + \varepsilon_t $$ with $\varepsilon_t \sim \text{i.i.d.}(0,\sigma^2)$, the optimal* one-step-ahead prediction is $$ \hat x_{t+1|t}:=\mathbb{E}(x_{t+1}|x_{t},x_{t-1},\dots)=x_{t} $$ which happens to be the realization $x_{t+1}$ lagged by one ($x_t$ is $x_{t+1}$ lagged by 1). Given the data generating process, this apparently lagging prediction is the best we can get*. So there is no way to "fix" the "problem" of lagging predictions; the problem is built in, due to the nature of the data generating process.
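To make the argument above concrete, here is a minimal simulation sketch. It assumes Gaussian increments with $\sigma = 1$, and the function names (`simulate_random_walk`, `mse_at_shift`) are made up for illustration. It shows that the naive forecast $\hat x_{t+1|t} = x_t$ has a one-step error of about $\sigma^2$, matches the realized series exactly once shifted by one step, and that "repeat the last direction" is right only about half the time on such data.

```python
import numpy as np

def simulate_random_walk(n, sigma=1.0, seed=0):
    """Simulate x_t = x_{t-1} + eps_t with Gaussian eps_t (an assumption)."""
    rng = np.random.default_rng(seed)
    return np.cumsum(rng.normal(0.0, sigma, size=n))

def mse_at_shift(actual, forecast, shift):
    """MSE between forecast[t] and actual[t - shift]."""
    if shift == 0:
        return float(np.mean((forecast - actual) ** 2))
    return float(np.mean((forecast[shift:] - actual[:-shift]) ** 2))

x = simulate_random_walk(5000)

actual = x[1:]     # realizations x_{t+1}
forecast = x[:-1]  # naive forecasts: x_hat_{t+1|t} = x_t

mse_shift0 = mse_at_shift(actual, forecast, 0)  # honest one-step error, about sigma^2
mse_shift1 = mse_at_shift(actual, forecast, 1)  # forecast vs. actual lagged by one step

# Directional version: predict that the next move repeats the last move.
direction = np.sign(np.diff(x))
hit_rate = float(np.mean(direction[1:] == direction[:-1]))

print(f"MSE at shift 0: {mse_shift0:.3f}")
print(f"MSE at shift 1: {mse_shift1:.3f}")
print(f"Directional hit rate: {hit_rate:.3f}")
```

The zero error at shift 1 is the visual "lag" in the plots: the forecast is literally the previous observation. A hit rate near 0.5 suggests that lagged values carry no directional information when the series really is a random walk with independent, symmetric increments.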

Is this same problem mentioned HERE?

Broadly, yes, although I am not an expert on real output (some more subject-matter knowledge could be helpful to determine how well the process can be approximated by a random walk).
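One rough empirical way to see how close a series is to a random walk is to look at the lag-1 autocorrelation of its first differences, which should be near zero for a random walk. The sketch below is only an illustration on simulated data, not a formal test; the helper name `lag1_autocorr` and the AR(1) coefficient 0.5 used for contrast are assumptions of this example.

```python
import numpy as np

def lag1_autocorr(values):
    """Sample lag-1 autocorrelation of a 1-D series."""
    v = np.asarray(values, dtype=float)
    v = v - v.mean()
    return float(np.sum(v[1:] * v[:-1]) / np.sum(v * v))

n = 5000
rng = np.random.default_rng(1)

# Random walk: increments are independent, so differences look like white noise.
walk = np.cumsum(rng.normal(size=n))

# Contrast case (hypothetical): a stationary AR(1) series with coefficient 0.5.
eps = rng.normal(size=n)
ar1 = np.zeros(n)
for t in range(1, n):
    ar1[t] = 0.5 * ar1[t - 1] + eps[t]

r_walk = lag1_autocorr(np.diff(walk))
r_ar1 = lag1_autocorr(np.diff(ar1))

# Rule-of-thumb band: about +/- 2 / sqrt(n) under white noise.
band = 2.0 / np.sqrt(n - 1)

print(f"random walk differences: r1 = {r_walk:.3f}")
print(f"AR(1) differences:       r1 = {r_ar1:.3f}")
print(f"approximate white-noise band: +/- {band:.3f}")
```

If the differences of the real data show autocorrelation well outside the band, the series has some predictable structure beyond a pure random walk, and lagged inputs may then help; if not, the lagging prediction is probably close to the best available.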

* Under square loss and/or directional 0-1 loss.
