Solved – Performing a time series ARIMA model on natural gas power demand using the forecast package from R

arimaforecastingrtime series

I've been attempting to forecast natural gas power demand and how it is affected by temperature and price. I'm not sure if I have done everything correctly (relatively new to R), but I do seem to get relevant data other than I can't seem to change my forecast period, nor am I sure this is an appropriate model for this data. Hopefully someone can provide me with some guidance.

Data: demand.csv

library(forecast)
data = read.csv("demand.csv")

# Create matrix of numeric predictors
xreg <- cbind(weather=data$Weather,price=data$Price,m1=data$M1,
m2=data$M2,m3=data$M3,m4=data$M4,m5=data$M5,m6=data$M6,
m7=data$M7,m8=data$M8,m9=data$M9,m10=data$M10,m11=data$M11)

# Rename columns
colnames(xreg) <- c("Weather","Price","Jan","Feb","Mar","Apr",
"May","Jun","Jul","Aug","Sep","Oct","Nov")

# Variable to be modelled
demandTS <- ts(data$Demand, frequency=12)

# Find ARIMAX model
demandArima <- auto.arima(demandTS, xreg=xreg)
demand.fcast <- forecast(demandArima, xreg=xreg)
plot(demand.fcast)

Thank you for any help.

References:

How to setup xreg argument in auto ARIMA in R
From auto ARIMA to forecast in R

Best Answer

Unfortunately you have few technical errors here.

You cannot make ARIMAX-model with library(forecast) function auto.arima. Xreg argument makes it regression model with ARMA errors. That is something which I had to learn hard way by wondering the results... :)

And you have to supply FUTURE values for the xreg argument in the forecast-function. Split your data into two parts: 1) one to fit model 2) future values for the exogenous variables. Auto.arima does not forecast future values of xreg variables by ARIMA-models.

If you want to try ARIMAX-models try library(TSA) with arimax-function which of course has different syntax than auto.arima - function... :)

EDIT:

Here is example for using auto.arima with xreg argument with data set having first data for model parameters estimation and then for forecasting.

library(forecast);  
apu=read.table("demo_1.csv",sep=";",dec=",",header=TRUE);  
apux=read.table("demo_2.csv",sep=";",dec=",",header=TRUE);  
apuxx=read.table("demo_3_xreg.csv",sep=";",dec=",",header=TRUE);  
apu2=ts(data=apu[2],start=c(2011,1),deltat=1/365);  
apu3=ts(data=apu[3],start=c(2011,1),deltat=1/365);  
apu4=ts(data=apux[1],start=c(2011,1),deltat=1/365);  
Acf(apu2);  
Pacf(apu2);  
apu5=ts.intersect(apu3,apu4);  
apu6=ts(data=apuxx[3],start=c(2013,263),deltat=1/365);  
apu7=ts(data=apuxx[2],start=c(2013,263),deltat=1/365);  
apu8=ts.intersect(apu6,apu7);   


sarimax=auto.arima(apu2, d=NA, D=NA, max.p=5, max.q=5,  
     max.P=365, max.Q=365, max.order=5, start.p=2, start.q=2,  
     start.P=1, start.Q=1, stationary=FALSE, seasonal=TRUE,  
     ic=c("aicc","aic", "bic"), stepwise=TRUE, trace=FALSE,  
     approximation=(length(apu2)>100 | frequency(apu2)>12), xreg=apu5,  
     test=c("kpss","adf","pp"), seasonal.test=c("ocsb","ch"),  
     allowdrift=TRUE, lambda=0, parallel=FALSE, num.cores=NULL);  
print(sarimax$arma);  
    print(accuracy(sarimax));  
    print(sarimax$coef);  
plot(sarimax$residuals);    
    print(Box.test(sarimax$residuals,lag=30,type=c("Ljung-Box")));    

sarimaxpredicts=forecast(sarimax, h=7,level=c(75,80,90,95), fan=FALSE, xreg=apu8,  lambda=sarimax$lambda,bootstrap=FALSE, npaths=5000);  
plot(sarimaxpredicts);