Correctness of regression with ARIMA errors model and coefficient interpretation issues

arima, forecasting, r, regression coefficients, time series

I am trying to forecast electricity consumption in GWh two years ahead (from June 2013 onward), using R (the forecast package). For that purpose, I tried regression with ARIMA errors. I fitted the model using the auto.arima function, and I passed the following variables in the xreg argument of the forecast.Arima function:

– Heating and Cooling Degree Days,
– Dummies for all 12 months and
– Moving holidays dummies (Easter and Ramadan)

I have several questions regarding the model:

1) Is it correct to use all 12 dummies for monthly seasonality? When I tried to include only 11, the function returned an error. The auto.arima function returned an ARIMA(0,1,2) model.

2) The model returned the following coefficients (I won't list all of them, as there are too many):

ma1     ma2     HDD    CDD    January   February   March    April
-0.52   -0.16   0.27   0.12   525.84    475.13     472.57   399.01

I am trying to determine the influence of the temperature component on electricity load. In percentage terms (interpreting the coefficients just as in an ordinary regression), the temperature components (HDD + CDD) account for 11.3% of the electricity consumption. Isn't this too little, considering that electricity consumption is mostly influenced by the weather? On the other hand, looking at the dummies' coefficients, it turns out that seasonality accounts for the greater part of the load. Why is this? Is the model completely incorrect?

I also tried ordinary linear regression, where the temperature component accounts for 20%, which is still a low percentage. Why is this?

3) I am obviously making some mistake in how I use forecast.Arima or the parameters of the plot function: when I plot the forecasts, I get the original time series continued by (merged with) the forecasts, covering the whole period from 2004 until 2015. I don't know how to explain this better. I tried to paste the picture, but it seems I cannot paste pictures here.

Best Answer

  1. In any regression model, including a regression with ARMA errors, you must specify one fewer dummy variable than the number of categories. Intuitively, this is because if you know the values of 11 monthly dummy variables, then you know the value of the 12th, so it provides no new information.

  2. There are two problems here. First, seasonality is confounded with the weather, so you cannot separate out their effects. Second, it is not possible to allocate a percentage contribution from each predictor unless the predictors are all orthogonal.

  3. The plotting method for forecast objects shows the historical data and the forecasts along with prediction intervals. Look at the help file to see how to modify the plot to your own purposes.
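
Points 1 and 2 can be checked numerically. The following is a minimal sketch in base R on simulated data, not the asker's data and not output from the forecast package. The names `gwh`, `hdd`, `month` and `ss_share`, and every number used to generate the series, are hypothetical and chosen only for illustration. It shows that 12 monthly dummies are linearly dependent (both alongside an intercept and after first differencing, which is what d = 1 applies), and that the share of variation attributed to a seasonal temperature variable depends on the order in which the predictors enter the model.

```r
# Hypothetical monthly data: 10 full years, so each month appears 10 times.
set.seed(42)
n     <- 120
month <- factor(rep(month.abb, length.out = n), levels = month.abb)

# Point 1: all 12 month dummies add up to a column of ones, so together
# with an intercept the design matrix loses one rank.
M12 <- model.matrix(~ month - 1)           # 12 dummy columns, no intercept
X12 <- cbind(Intercept = 1, M12)           # 13 columns
X11 <- model.matrix(~ month)               # intercept + 11 dummies
rank12 <- qr(X12)$rank
rank11 <- qr(X11)$rank
cat("12 dummies + intercept:", ncol(X12), "columns, rank", rank12, "\n")
cat("11 dummies + intercept:", ncol(X11), "columns, rank", rank11, "\n")

# The same dependence survives first differencing: the differenced
# 12 dummies sum to zero in every row, so one of them is redundant.
D12       <- diff(M12)
rank_diff <- qr(D12)$rank
cat("differenced 12 dummies:", ncol(D12), "columns, rank", rank_diff, "\n")

# Point 2: a seasonal temperature regressor (hypothetical HDD values)
# is strongly correlated with the month dummies.
seasonal_shape <- cos(2 * pi * (as.integer(month) - 1) / 12)
hdd <- 300 + 250 * seasonal_shape + rnorm(n, sd = 30)
gwh <- 500 + 0.3 * hdd + rnorm(n, sd = 10)  # simulated load, driven by HDD only

# Share of the total sum of squares attributed to each term, using the
# sequential (order-dependent) sums of squares reported by anova().
ss_share <- function(formula) {
  tab <- anova(lm(formula))
  setNames(tab[["Sum Sq"]] / sum(tab[["Sum Sq"]]), rownames(tab))
}
share_hdd_first   <- ss_share(gwh ~ hdd + month)
share_month_first <- ss_share(gwh ~ month + hdd)
print(round(share_hdd_first, 3))
print(round(share_month_first, 3))
```

In this simulation the load depends on temperature alone, yet when the month dummies enter first they absorb most of the variation and little is left for `hdd`. Neither ordering gives "the" percentage contribution, which is why such a split is only meaningful when the predictors are orthogonal.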
