I’m trying to model a time series (log_consommation) as an ARIMA(p,d,q) process using Stata.
So I start by determining d by transforming my time series to make it stationary.
My question is: when performing an augmented Dickey-Fuller test for stationarity, I have to choose the number of lags. Is this number of lags related to the p of the ARMA(p,q) model I will estimate later? How can it be determined without using dfgls?
I have tried several different lag lengths, but I’m not sure how to choose among them. The commands I ran are below, followed by a table of the results I obtain.
dfuller log_consommation, lags(0) regress
regress D.log_consommation l1.log_consommation
estat ic
dfuller log_consommation, lags(1) regress
regress D.log_consommation l1.log_consommation l1.(D.log_consommation)
estat ic
dfuller log_consommation, lags(2) regress
regress D.log_consommation l1.log_consommation l1.(D.log_consommation) l2.(D.log_consommation)
estat ic
dfuller log_consommation, lags(3) regress
regress D.log_consommation l1.log_consommation l1.(D.log_consommation) l2.(D.log_consommation) l3.(D.log_consommation)
estat ic
dfuller log_consommation, lags(4) regress
regress D.log_consommation l1.log_consommation l1.(D.log_consommation) l2.(D.log_consommation) l3.(D.log_consommation) l4.(D.log_consommation)
estat ic
I obtain:
Augmented Dickey-Fuller test regressions and test statistic
----------------------------------------------------------------------------------------------
                          (1)          (2)          (3)          (4)          (5)          (6)
                       No lag        1 lag       2 lags       3 lags       4 lags       5 lags
----------------------------------------------------------------------------------------------
L.log_cons~n       -0.0137***   -0.0105***  -0.00741***   -0.00678**   -0.00662**    -0.00574*
                      (-8.06)      (-5.12)      (-3.40)      (-2.96)      (-2.76)      (-2.31)
LD.log_con~n                        0.173*        0.131        0.118        0.113        0.108
                                    (2.00)       (1.50)       (1.28)       (1.21)       (1.16)
L2D.log_co~n                                    0.283**      0.249**       0.245*       0.228*
                                                 (3.34)       (2.82)       (2.62)       (2.43)
L3D.log_co~n                                                  0.0608       0.0527       0.0217
                                                              (0.68)       (0.57)       (0.22)
L4D.log_co~n                                                               0.0195     -0.00939
                                                                           (0.22)      (-0.10)
L5D.log_co~n                                                                             0.119
                                                                                        (1.32)
_cons                0.176***     0.135***    0.0952***     0.0874**     0.0855**      0.0743*
                       (8.57)       (5.34)       (3.52)       (3.07)       (2.85)       (2.39)
----------------------------------------------------------------------------------------------
aic                    -888.5       -889.0       -890.1       -881.7       -871.4       -863.4
bic                    -882.8       -880.6       -878.9       -867.7       -854.7       -844.0
t_ADF                  -8.062       -5.118       -3.399       -2.965       -2.759       -2.310
----------------------------------------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
Best Answer
Use ac and pac in Stata to assess the possible lags. However, if you are fitting an ARMA model, it is usual to estimate each candidate specification, from p=0, q=1 up to p=3, q=3, and then obtain the AIC and BIC for each one. The model with the lowest AIC or BIC is chosen. The two criteria may select different lags, but in either case you have to make sure that the residuals of the chosen model are white noise.
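The procedure above could be sketched in Stata roughly as follows. This is only an illustration under assumptions that are not in the question: the data are already tsset, one difference (d = 1) is enough to make log_consommation stationary, and the lag counts 20 and 12 are arbitrary choices. The residual variables res_p_q and the scalars best_bic, best_p and best_q are hypothetical names introduced for the example.

```stata
* Assumption: the data are tsset and d = 1; change arima(p,1,q) if your d differs.

* Correlograms of the differenced series suggest candidate p (pac) and q (ac)
ac  D.log_consommation, lags(20) name(ac_dlog, replace)
pac D.log_consommation, lags(20) name(pac_dlog, replace)

* Fit every ARIMA(p,1,q) with p, q in 0..3 and keep track of the lowest BIC.
* (p = 0, q = 0 is included only as a baseline.)
scalar best_bic = .
forvalues p = 0/3 {
    forvalues q = 0/3 {
        capture noisily arima log_consommation, arima(`p',1,`q')
        if _rc continue                 // skip specifications that fail to estimate

        display as text "ARIMA(`p',1,`q')"
        estat ic                        // reports AIC and BIC
        matrix S = r(S)                 // columns: N, ll0, ll, df, AIC, BIC
        if S[1,6] < best_bic {          // missing (.) compares as larger than any number
            scalar best_bic = S[1,6]
            scalar best_p   = `p'
            scalar best_q   = `q'
        }

        * Portmanteau (Ljung-Box) test: H0 is that the residuals are white noise
        capture drop res_`p'_`q'
        predict double res_`p'_`q', residuals
        wntestq res_`p'_`q', lags(12)
    }
}
display as text "Lowest BIC: ARIMA(" best_p ",1," best_q "), BIC = " best_bic
```

Because arima fits every specification on the same differenced sample, the information criteria are comparable across the models in the loop. A model with the lowest AIC or BIC should still be set aside if wntestq rejects white noise for its residuals.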