I am trying to use `predict_in_sample()` in an ARIMA model (Python package `pmdarima`) to estimate the whole time series, and `predict()` to predict the next 10 data points.

But they behave very differently: `predict_in_sample()` has large variance, while `predict()` seems to show only the trend, so it has small variance.

I wonder why?

```
import numpy as np
import matplotlib.pyplot as plt
from pmdarima import ARIMA

model_110 = ARIMA(order=(1,1,0), out_of_sample_size=0, mle_regression=True, suppress_warnings=True)
data = np.array([-0.4470452846772659, 0.4631402100263472, 0.1610124334119578, 0.693340634810911,
-0.1316835900738694, 0.5341828623686271, 0.3124124027120894, -0.4245041188583057,
-0.1761953729537292, 0.9014044836766212, 0.5675295783826219, 0.858043348790692,
-0.4463359580978329, 0.0434157527905978, 0.1055733636541966, 0.062881261869083,
0.6713645070129255, -0.1639428418080044, 0.4039964402038722, 0.2404774387508368,
0.182584179546703])
model = model_110.fit(data)
# One-step-ahead predictions over the training sample
pred_in_sample = list(model.predict_in_sample())
# Multi-step forecasts beyond the sample (10 periods, the default)
forecasts = model.predict(n_periods=10)
temp_x = np.arange(len(data)+10)
plt.figure(figsize=(10,8))
plt.plot(temp_x[:len(data)], data, label="data")
# The -1 shifts the forecast line so that it starts at the last observed index
plt.plot(temp_x[len(data):]-1, forecasts, label="forecast")
plt.plot(pred_in_sample, c="red", label="pred-in-sample")
plt.legend()
plt.show()
plt.close()
```

## Best Answer

The "in-sample predictions" are rolling 1-ahead forecasts:

$$p_t := \mathbb{E}(X_t | \mathcal{F}_{t-1})$$

But the out-of-sample predictions are $h$-ahead forecasts:

$$f_h := \mathbb{E}(X_{T+h} | \mathcal{F}_{T})$$

In particular, every in-sample prediction is conditioned on a *different* information set, whereas the out-of-sample predictions are all conditioned on the *same* one. They are entirely different objects.

In-sample predictions are mostly used computationally for fitting and for assessing model fit. They are not real, out-of-sample forecasts.
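To make the distinction concrete, here is a minimal sketch for an ARIMA(1,1,0) model without an intercept, where the differences follow $\Delta X_t = \phi \, \Delta X_{t-1} + \varepsilon_t$. It uses only the standard library rather than `pmdarima`, and the coefficient `phi = -0.5` is a hypothetical value chosen for illustration, not the value the package would estimate. The helper names `one_step_predictions`, `h_step_forecasts` and `spread` are likewise made up for this example.

```python
# Observations from the question, rounded to 4 decimals for brevity.
data = [-0.4470, 0.4631, 0.1610, 0.6933, -0.1317, 0.5342, 0.3124,
        -0.4245, -0.1762, 0.9014, 0.5675, 0.8580, -0.4463, 0.0434,
        0.1056, 0.0629, 0.6714, -0.1639, 0.4040, 0.2405, 0.1826]

phi = -0.5  # hypothetical AR coefficient, NOT the value pmdarima estimates


def one_step_predictions(x, phi):
    """p_t = E(X_t | F_{t-1}): each prediction uses data up to t-1 only."""
    return [x[t - 1] + phi * (x[t - 1] - x[t - 2]) for t in range(2, len(x))]


def h_step_forecasts(x, phi, horizon):
    """f_h = E(X_{T+h} | F_T): every forecast uses the same data, up to T."""
    level = x[-1]
    diff = x[-1] - x[-2]
    forecasts = []
    for _ in range(horizon):
        diff *= phi    # expected next difference; future shocks have mean zero
        level += diff  # accumulate expected differences to get the level
        forecasts.append(level)
    return forecasts


def spread(values):
    """Range (max minus min) of a list of numbers."""
    return max(values) - min(values)


in_sample = one_step_predictions(data, phi)
forecasts = h_step_forecasts(data, phi, horizon=10)

print("spread of in-sample predictions:", round(spread(in_sample), 4))
print("spread of 10 forecasts:         ", round(spread(forecasts), 4))
```

Each in-sample prediction is re-anchored on the newest observation, so the series of predictions inherits much of the data's variability. The forecasts never see new data: under these assumptions, with $|\phi| < 1$, they settle toward the constant $X_T + \Delta X_T \, \phi / (1 - \phi)$.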

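It is a common mistake to think that out-of-sample forecasts must "look like" the data. A forecast is a conditional mean: the future shocks that make the data wiggle have mean zero, so they average out and the forecast path is much smoother than any realized path. In practice, it is common in many fields to get better performance from simpler models like ARIMA, for example when reliably estimating a more complicated model is not feasible because data is limited.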