I am doing predictions on monthly temperature data for 100 years, from 1901 to 2000 (i.e 1200 data points). I want to know if the method I follow is correct because in my output, I do not see the requisite "randomness" of temperature being reproduced in the prediction.
Here is a link to the plot of the prediction (in red)
https://docs.google.com/file/d/0B1Lm03a_91xiYks5TVJDYU05VUE/edit?usp=sharing
EDIT: added the ACF and PACF of the detrended and de-seasonalised time series:
https://docs.google.com/file/d/0B1Lm03a_91xia2RTOHZrajJtZXM/edit?usp=sharing
Below is the dput() of my data:
> dput(fr.monthly.temp.ts)
structure(c(2.7, 0.4, 4.7, 10, 13, 16.9, 19.2, 18.3, 15.7, 10.6,
4.9, 3.5, 4.1, 3.2, 7.5, 10.3, 10, 15.1, 18.2, 17.4, 15, 10.2,
6.3, 3.5, 3.8, 5.9, 7.6, 7.1, 12.9, 14.9, 17.6, 17.3, 15.5, 12.1,
6.9, 2.7, 3, 4.6, 5.5, 10.3, 13.6, 16.3, 20.2, 18.5, 13.9, 11.2,
5.4, 4.8, 1.7, 4, 7.4, 9.3, 11.9, 16.5, 20, 17.6, 14.7, 8.4,
5.5, 3.8, 4.3, 3.1, 5.6, 8.5, 12.6, 16.1, 18.2, 18.9, 16, 12.7,
7.4, 2.3, 2.5, 2.1, 6.3, 8.4, 12.7, 15.1, 16.5, 17.9, 16.2, 11.6,
7.6, 5.6, 1.7, 4.8, 5, 7.7, 14.2, 16.8, 17.9, 17.1, 14.8, 12.1,
6.5, 3.6, 2.2, 2, 4.7, 10.4, 12.8, 14.2, 16.3, 18, 14.2, 12.2,
5, 4.9, 4, 5.4, 6.6, 8.5, 11.9, 16.1, 16.4, 17.3, 14.2, 11.9,
5.9, 6, 1.6, 4.5, 6.4, 8.3, 13.6, 16.1, 20.8, 20.7, 17.5, 11.3,
7.3, 6.6, 4.6, 6.8, 8.4, 9.2, 13.8, 15.5, 17.9, 15.5, 12.5, 10,
5.5, 5.8, 5.4, 4.7, 7.9, 9.1, 13, 15.8, 16.5, 17.6, 15.4, 12.3,
9.2, 4, 0.7, 6.5, 7.4, 11.2, 12.2, 15.3, 17.3, 18.2, 15.3, 10.6,
6.3, 5.7, 3.5, 4.3, 5.7, 8.5, 14.2, 17, 17.2, 17.5, 14.7, 9.6,
4.6, 7, 6.4, 4.8, 5.9, 9.5, 13.8, 14, 17.4, 18.4, 14.5, 11.5,
7, 4.3, 1.1, 1.4, 4.4, 6.7, 15.1, 17.6, 18.3, 17.2, 16.4, 9.4,
7.3, 1.4, 3.7, 5.4, 6.5, 8.4, 14.2, 15, 18, 18.1, 15.4, 9.7,
6.4, 6.9, 3.3, 3.7, 6.2, 7.8, 13.8, 16.3, 15.9, 18.9, 16.2, 8.8,
4.6, 5.5, 5, 6.4, 8.2, 9.9, 14.4, 16, 17.4, 16.5, 15.2, 11.5,
6, 4, 6.4, 4.2, 7.2, 8.9, 13.7, 16.9, 20.6, 18, 17, 14.1, 4.7,
4.5, 3.4, 4.7, 6.6, 8, 14.8, 16.3, 16.7, 16.9, 13.7, 9.2, 5.4,
4.5, 3.7, 6.3, 7.6, 9.4, 12.2, 14.1, 19.9, 18.8, 15.1, 12.3,
5.3, 3.8, 3.8, 2.4, 6.4, 9.2, 14.1, 16.2, 18, 15.9, 15.2, 11.7,
7.1, 4.5, 4.8, 5.6, 4.3, 9.1, 12.9, 17, 18, 17.6, 13.3, 11.8,
4.9, 3.9, 4.1, 8.3, 7.2, 10.3, 11.6, 14.5, 18.2, 18.7, 17.3,
11.5, 8.3, 2.5, 4.3, 4.5, 7.7, 9.8, 13.7, 15.7, 18, 17.8, 15.2,
11.3, 6.7, 2.9, 5, 6.4, 7.1, 9.3, 11.8, 16.1, 20.5, 19.3, 15.8,
11.5, 8.2, 3.7, 0.3, -0.2, 6.7, 7.8, 13.2, 16.3, 19.1, 18.1,
18.4, 11.4, 7.3, 6.4, 5.8, 3.3, 7, 9.7, 12.1, 17.7, 17.3, 18.2,
15.9, 11.9, 8.6, 4.5, 3.7, 3.3, 5.8, 8.8, 13.8, 17.5, 17.7, 17,
12.8, 10.6, 8.2, 3.2, 4.8, 1.4, 5.5, 8, 12.1, 15.8, 17.4, 20.4,
17.2, 11, 7.4, 5, 1.8, 4.3, 7.8, 10.1, 13.1, 15.4, 19.5, 20.1,
16.7, 12, 5.5, 0.3, 3.3, 3.1, 6.3, 10.4, 13.8, 17.2, 20, 17.5,
17.1, 11.9, 5.8, 7.6, 2.6, 5.1, 6.2, 9.1, 11.6, 17.2, 19.5, 18.1,
16.1, 10.7, 7, 3.9, 6.5, 4.6, 7.9, 8.3, 13.4, 16.1, 17.2, 18,
16, 9.1, 6.6, 4.2, 5.3, 6.9, 5.6, 9.9, 14.2, 16.6, 18.6, 19.1,
15.5, 11.7, 6.3, 3.2, 4.4, 3.9, 8.8, 7.7, 11.7, 16.8, 17.5, 18.2,
15.6, 11.3, 9.3, 2.5, 5.3, 4.7, 5.4, 10.2, 11.5, 16.4, 17.3,
18.1, 15.2, 10.3, 8.7, 2.6, -0.9, 4.5, 7.1, 9.6, 13.5, 17.1,
17.1, 17.5, 15.6, 10.6, 7.6, 1.1, 0.7, 4.5, 7.3, 8.2, 10.3, 16.8,
19.3, 16.9, 15.5, 10.8, 6.6, 3.7, -0.2, -0.1, 7.7, 10.6, 13.1,
16.7, 18.1, 18.7, 16.7, 13.2, 5.5, 4.8, 4.8, 5.3, 8, 11.5, 14.2,
16.4, 19.2, 19.2, 16, 12.4, 5.9, 3.4, 5.1, 2.2, 5.1, 11.1, 13.4,
16, 18.6, 20.6, 15.2, 10.1, 7.1, 3.4, -1, 7.1, 8.4, 11.9, 14.8,
17.8, 20, 18.1, 16.7, 12.3, 6.5, 4.8, 1.7, 6.4, 6.7, 11.2, 13.1,
15.7, 18.9, 17.9, 16.2, 11.3, 7.1, 2.1, 1, 1.3, 7.3, 11.3, 14.8,
17.9, 20.4, 20.9, 17.6, 12.1, 8.3, 3.8, 5.7, 4.5, 9.5, 10.4,
14, 15.8, 17, 17.8, 15.5, 11.4, 7.2, 4.6, 4.5, 5.4, 5.7, 11.7,
12.2, 16.8, 20.6, 19.8, 18.6, 13.4, 6.4, 5.1, 3, 6.4, 8, 8.7,
14.2, 18.3, 20.2, 18.6, 15.2, 11.4, 7.4, 1.1, 4.6, 4.7, 5.8,
9.1, 11.8, 16.1, 18.7, 17.5, 16.5, 10.5, 8.7, 4.9, 2.7, 2.8,
8.1, 11.2, 14.5, 17.9, 20.2, 18.9, 13.1, 10.9, 5.5, 3.5, 1.1,
3, 7.5, 10.1, 14.8, 15.4, 18, 18.8, 16.2, 12.1, 7, 6.8, 1.7,
2.3, 7.5, 8.6, 12.6, 16, 16.4, 16.9, 15.5, 12.4, 8, 6.2, 4.4,
3.6, 4.6, 10.3, 12.5, 16.4, 19.1, 19.2, 15.7, 10.4, 6.7, 6.4,
4.4, -1.8, 6.7, 8.1, 13.8, 14.4, 17.8, 16.4, 16.4, 10.6, 5.3,
5.2, 3.1, 6.9, 9.8, 9.6, 11.5, 17, 18.5, 17.6, 15.1, 11.8, 6.8,
3.6, 3.7, 6.2, 4.9, 7.9, 13.9, 15.6, 17.9, 18.4, 17.3, 11.4,
6.7, 5.1, 3.4, 4.5, 8.6, 10.2, 13.8, 17, 20.3, 18.9, 17.2, 12.2,
6.8, 5.7, 3.5, 5, 8, 9.6, 14.5, 17.6, 16.8, 17.3, 14.5, 11.1,
8.4, 3.5, 3.6, 7.6, 8.3, 11.7, 12.5, 16.6, 17.7, 18, 18.5, 12.3,
6.4, 4.5, 4.8, 3.7, 3.9, 9.1, 11.5, 15.8, 17.6, 18.6, 15.5, 11.9,
5.4, 1.3, -1.6, -0.3, 6.5, 9.6, 12.2, 15.8, 18.5, 16.5, 15.2,
11.5, 9.3, 1.3, 1.5, 5.2, 5.6, 9.6, 14.5, 16.8, 19.6, 18.2, 16.7,
9.6, 7.2, 3.2, 3.6, 1.7, 6.6, 8.7, 12.7, 16.1, 16.7, 17.1, 13.7,
12.2, 6.3, 5.7, 2.6, 7.9, 6.2, 10.5, 13.2, 17, 16.8, 17.2, 16.6,
12.7, 5, 5.3, 3.5, 5.5, 7.7, 8.8, 12.5, 15.6, 19.8, 18.1, 15.3,
13.2, 7.1, 3, 3.3, 4.3, 6.8, 9.9, 11.8, 15.9, 17.8, 17.2, 15.1,
13.5, 6.8, 3, 4.8, 2.1, 6.2, 9.2, 13.2, 15, 19.1, 18.1, 15.9,
13.1, 7.1, 1.4, 4.1, 4.3, 4.4, 7.6, 12.8, 17.6, 17.8, 18.3, 16.6,
11.3, 8.7, 2.6, 3.1, 4.2, 3.8, 10.5, 13.7, 14.8, 19.7, 18.7,
15.7, 12.3, 5.8, 4.9, 3.2, 5.5, 7.9, 8.9, 11.7, 14.3, 18, 17.1,
13.3, 10.9, 7.3, 4.5, 3, 3.4, 6.1, 7.6, 13.5, 17, 18.1, 19.9,
16.7, 10.6, 6.8, 3.7, 6.2, 5.5, 7.3, 9.4, 12.5, 15.9, 17.7, 18.6,
14.5, 8.2, 7.4, 6.8, 6.4, 5.5, 5.3, 9, 12.1, 15.9, 19.1, 19.8,
16.1, 10.4, 6.7, 3.1, 4.1, 4.8, 6, 8.9, 14, 18.8, 20.1, 19, 14.8,
11.8, 6.6, 3.1, 3.7, 6.6, 8.3, 8.3, 12.1, 14.8, 17.8, 16.9, 14.7,
12.9, 7, 5.3, 3.3, 4, 7.2, 7.8, 12.3, 15.2, 17.3, 17.2, 15.6,
11.8, 6.7, 5.1, 1.3, 4, 6.6, 8.2, 12.3, 16.5, 18.5, 17.1, 15.7,
12.4, 6.7, 5.7, 2.2, 6.3, 6.2, 8.4, 11.9, 15, 16.4, 18.6, 16.5,
10.8, 5.8, 3.1, 3.3, 2.9, 9.2, 10, 12.6, 16, 17.5, 18.8, 16.2,
11.2, 7.2, 3.8, 4.6, 5, 6.3, 9.3, 13.4, 17.4, 20.1, 18, 17.4,
11.4, 8.3, 4.9, 5.5, 2.5, 7, 8.9, 11.5, 17.1, 22.2, 19.3, 16.3,
11.9, 7.6, 4.5, 4.2, 3.5, 5.2, 9.6, 10.4, 15.8, 18.8, 18.4, 14.7,
11.9, 9, 4.5, -1, 3.8, 5.2, 9.7, 12.5, 15.3, 19.4, 17.6, 17.3,
12.3, 4.4, 5.6, 3.9, -0.6, 5.9, 6.9, 13.7, 16.9, 18.7, 17.6,
14.9, 13.1, 7.9, 5, -0.8, 3.7, 4.8, 10.9, 11.4, 15, 18.6, 18.6,
17.8, 12.4, 7.1, 5.2, 6.4, 4.9, 6.5, 10.1, 13.8, 16.2, 17.8,
18.7, 15.7, 12.9, 6.3, 6, 4.2, 5.6, 9.3, 8.2, 15.3, 16.9, 20.2,
19.5, 16.5, 13.2, 7, 5.6, 4.8, 8.8, 8.7, 8.9, 15.3, 16, 19.7,
20.4, 15.9, 13.3, 7.2, 3.1, 3.9, 1.9, 9, 8.7, 11.7, 14.9, 19.6,
20.7, 17.9, 10.9, 6.9, 3.6, 2.8, 4.9, 7.6, 9.5, 15.3, 16.1, 19.1,
19.9, 15.5, 9.6, 9, 4.8, 5.9, 3.5, 7, 10.4, 14.1, 17.3, 17.8,
18.7, 14.7, 10.4, 4.8, 6.2, 5.2, 5.1, 9.4, 8.7, 13.6, 17.1, 21.4,
19.9, 15, 12, 10.2, 6.5, 4.5, 7.5, 6.5, 9.9, 13.6, 16.1, 21.1,
20.2, 14.5, 14.6, 7.5, 3.8, 5, 2.9, 6, 10, 12.2, 17.5, 18.7,
18.2, 14.2, 11.9, 6.9, 3.4, 2.3, 6.9, 9.3, 10, 14.2, 16.3, 18.6,
21, 17, 12.4, 8.4, 5.5, 5, 5.9, 8.1, 9, 14.9, 17, 18.5, 19.4,
16.1, 11.6, 5.2, 4.5, 5.3, 4.3, 8, 10, 15.2, 16.3, 20.2, 19.4,
17.9, 12.2, 6.4, 5, 3.7, 6.6, 7.5, 9.9, 15, 17.8, 17.5, 19.6,
16.9, 12.2, 8.2, 7.1), .Tsp = c(1901, 2000.91666666667, 12), class = "ts")
I run stl()
on it to remove the seasonality:
# calculate and remove the seasonality
fr.monthly.temp.ts.stl <- stl(fr.monthly.temp.ts, s.window="periodic") # get the components
fr.monthly.temp.seas <- fr.monthly.temp.ts.stl$time.series[,"seasonal"]
#plot(fr.monthly.temp.seas)
fr.monthly.temp.ts.noseas <- fr.monthly.temp.ts - fr.monthly.temp.seas
#plot(fr.monthly.temp.ts.noseas)
Then remove the trend with a regression:
fr.mtrend.noseas <- lm(fr.monthly.temp.ts.noseas~t)
summary(fr.mtrend.noseas)
and then use the residuals of this model to fit an ARIMA model (after checking the ACF and PACF for which one is appropriate):
# create time series of residuals..this is our "detrended" series..for now use only linear trend result
fr.monthly.temp.ts.new <- ts(fr.mtrend.noseas$resid, start=c(1901,1), frequency=12)
#plot.ts(fr.monthly.temp.ts.new, main="Detrended and de-seasonalized time series")
# ARIMA 1,1,1
fit6 <- arima(fr.monthly.temp.ts.new,order=c(1,1,1))
fit6
tsdiag(fit6)
I then make a prediction on the stationary time series:
#forecast for the stationary TS, for next 50 yrs months
forecast <- predict(fit6,n.ahead=600)
And then add back the trend and seasonality:
t.new <- (n+1):(n+600)
#initial time series = stationaryTS + seasonality + trend
fr.monthly.temp.ts.init <- fr.monthly.temp.ts.new + fr.monthly.temp.seas +
fr.mtrend.noseas$coefficients[1] + t * fr.mtrend.noseas$coefficients[2]
#same for the prediction: we need to add seasonality and trend
pred.Xt <- forecast$pred + fr.monthly.temp.seas[1:(1+50*12 - 1)] +
fr.mtrend.noseas$coefficients[1] + t.new * fr.mtrend.noseas$coefficients[2]
plot(fr.monthly.temp.ts.init,type="l",xlim=c(1940,2060))
lines(pred.Xt,col="red",lwd=2)
So going back to my question: Do I need to add some white noise to the prediction to be able to realistically predict temperature? And more generally, is my method correct?
Best Answer
I think you're doing this correctly. The predictions show the deterministic component of the model, as intended. As a loose analogy, if you add a trend line to a simple scatter plot
(abline(lm(...)
, e.g.), you wouldn't expect the trend line to wobble around. Similarly, the forecast represents the best guess at the future temperatures.Were you perhaps interested in a stochastic simulation, or bounding the forecast estimates with confidence intervals? Or maybe there is another seasonal component in the observed time series that is missing from your forecast? I'll elaborate on these three possibilities below.
forecast$se
. Multiply that by the appropriate critical value, e.g.,qnorm(0.975)
, and plot using lines(), and you can add some confidence bands.diffinv(rnorm(40))
). Your model has more terms, so it's not so simple. Try something akin toX[i] <- B1*X[i-1] + Sigma*rnorm(1)
, where Sigma is the s.d. of the noise that you want to add, B1 is the AR(1) coefficient, and X is the response variable. Adjust as necessary to include your seasonal and MA terms, etc. You could choose Sigma based on the residual variance of your fitted time series model.To directly answer your questions:
Do you need to add a white noise term? No. If it's noise you want to add, do a stochastic simulation, which would allow the noise to propagate naturally (a shock at time t-1 will have an effect at time t b/c X[t] is correlated with X[t-1]; also, the MA term means that the shocks/errors are correlated b/c Epsilon[t] is correlated with Epsilon[t-1]).
Is you method correct? Given the structure of your model, your forecast seems reasonable. If the forecast is lacking some deterministic pattern, try accounting for things like NAO. If you desire an explicit graphical representation of uncertainty (due to lack of additional terms, or other process/ observation errors), I would suggest communicating that the model upon which your forecast is based has some residual variance --- put s.e.'s or CI's on the forecast.
My hunch is that you're mostly interested in doing a stochastic simulation. That would be the way to add randomness.