ADF Test – Should a Trend Be Selected if the Trend Changes in Unit Root Testing?

Tags: augmented-dickey-fuller, multiple regression, regression, unit root

Let's say you have a variable with a time series going back ten years. In the first 5 years, it clearly trends from 5 to 10. And, in the next 5 years, it trends downward from 10 back down to 5. When you test it for Unit Root using either Dickey-Fuller or the ADF test, should you use the ADF test with a trend or not?

Best Answer

I think the best test specification would be neither ADF with no trend nor ADF with a linear trend, because clearly neither alternative adequately reflects the actual trend in the data.

You may consider using the covariate-augmented Dickey-Fuller (CADF) test proposed in Hansen, "Rethinking the Univariate Approach to Unit Root Testing: Using Covariates to Increase Power" (1995). Hansen's own R code for the test is available here. There is also an R package, "CADFtest" by Claudio Lupi, with a vignette and a reference manual, which may be more readily usable than Hansen's code.

For the CADF test you would supply two regressors, t1=c(1:br,rep(0,T-br)) and t2=c(rep(0,br),1:(T-br)), to account for the two linear components of the trend, where br is the last point of the upward-trending period and T is the length of the data sample.
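As a minimal sketch, the two regressors could be built in base R as follows. The values T_len = 10 and br = 5 are made up for illustration, and the name T_len is used instead of T only to avoid masking R's shorthand for TRUE.

```r
# Hypothetical sample length and break point, chosen only for illustration
T_len <- 10
br <- 5

# t1 counts 1..br during the upward-trending period and is 0 afterwards
t1 <- c(1:br, rep(0, T_len - br))
# t2 is 0 up to the break and counts 1..(T_len - br) afterwards
t2 <- c(rep(0, br), 1:(T_len - br))

# Column-bind the two trend components into one regressor matrix
X <- cbind(t1, t2)
print(X)
```

Each row of X has at most one non-zero entry, so the two columns describe separate linear segments on either side of the break.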

However, I am unsure how the use of t1 and t2 fits the stationarity requirement for the regressors. Since the trend components in t1 and t2 are nonstationary, they might mess up the null distribution of the parameter of interest in the CADF test regression. That could be a good argument for not using the CADF test in this situation.

If so, you could perhaps just split your sample into two parts and use the regular ADF test with a trend for each of them. It should be better than using the ADF test for the whole sample regardless of inclusion or exclusion of a linear trend. Doing the latter might well induce the ADF test to suggest presence of a unit root even if the process around this "broken trend" is actually stationary.
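The split-sample idea can be sketched in base R. This is an illustration under assumptions, not a replacement for a packaged test: the series is simulated (a broken trend plus stationary AR(1) noise), the break point br is taken as known, and adf_trend_stat is a hypothetical helper that only computes the t statistic on the lagged level from an ADF-with-trend regression. It does not supply critical values, which would still have to come from Dickey-Fuller tables or a package.

```r
# Hypothetical helper: t statistic on the lagged level in an ADF regression
# with intercept, linear trend, and `lags` lagged differences (lags >= 1)
adf_trend_stat <- function(y, lags = 1) {
  stopifnot(lags >= 1)
  dy <- diff(y)
  n <- length(dy)
  # embed() puts the current difference in column 1 and its lags after it
  z <- embed(dy, lags + 1)
  z.diff <- z[, 1]
  z.diff.lag <- z[, -1, drop = FALSE]
  # level of y one period before each current difference
  z.lag.1 <- y[(lags + 1):n]
  tt <- seq_along(z.diff)
  fit <- lm(z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
  unname(coef(summary(fit))["z.lag.1", "t value"])
}

# Simulated data (an assumption): trend 5 -> 10 -> 5 plus stationary noise
set.seed(1)
T_len <- 120
br <- 60
trend <- c(seq(5, 10, length.out = br), seq(10, 5, length.out = T_len - br))
noise <- as.numeric(arima.sim(list(ar = 0.5), n = T_len, sd = 0.3))
y <- trend + noise

stat_full   <- adf_trend_stat(y)                   # whole sample, one trend
stat_first  <- adf_trend_stat(y[1:br])             # upward-trending part
stat_second <- adf_trend_stat(y[(br + 1):T_len])   # downward-trending part
print(c(full = stat_full, first = stat_first, second = stat_second))
```

Each statistic would then be compared with the trend-case Dickey-Fuller critical values for the corresponding sample size; note that halving the sample also reduces the power of each individual test.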

The last option for someone who is good at unit root asymptotics would be to derive the appropriate null distribution of the CADF test in this "broken trend" setting.

(Here is a somewhat related post.)

Related Solutions

Time Series Analysis in R – Interpreting Dickey-Fuller Unit Root Test Results (ur.df)

It seems the creators of this particular R command presume one is familiar with the original Dickey-Fuller formulae, so they did not document how to interpret the values. I found that Enders was an incredibly helpful resource (Applied Econometric Time Series 3e, 2010, p. 206-209; I imagine other editions would also be fine). Below I'll use data from the urca package, real income in Denmark, as an example.

> library(urca); data(denmark)
> income <- ts(denmark$LRY)

It might be useful to first describe the 3 different formulae Dickey-Fuller used to get different hypotheses, since these match the ur.df "type" options. Enders specifies that in all of these 3 cases, the consistent term used is gamma, the coefficient for the previous value of y, the lag term. If gamma=0, then there is a unit root (random walk, nonstationary). Where the null hypothesis is gamma=0, if p<0.05, then we reject the null (at the 95% level), and presume there is no unit root. If we fail to reject the null (p>0.05) then we presume a unit root exists. From here, we can proceed to interpreting the tau's and phi's.

type="none": $\Delta y_t = \gamma \, y_{t-1} + e_t$ (formula from Enders p. 208)

(where $e_t$ is the error term, presumed to be white noise; $\gamma = a-1$ from $y_t = a \,y_{t-1} + e_t$; $y_{t-1}$ refers to the previous value of $y$, so is the lag term)
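To spell out where $\gamma = a - 1$ comes from, subtract $y_{t-1}$ from both sides of the AR(1) equation:

$$y_t - y_{t-1} = a \, y_{t-1} - y_{t-1} + e_t \quad\Longrightarrow\quad \Delta y_t = (a - 1) \, y_{t-1} + e_t = \gamma \, y_{t-1} + e_t$$

So $a = 1$ (a unit root) corresponds to $\gamma = 0$, while a stationary process with $|a| < 1$ gives $\gamma < 0$. This is why the tau critical values are negative and the null is rejected only when the test statistic falls below them.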

For type = "none", tau (tau1 in the R output) is the test statistic for the null hypothesis gamma = 0. Using the Denmark income example, I get "Value of test-statistic is: 0.7944" and "Critical values for test statistics: tau1 -2.6 -1.95 -1.61". Since the test statistic lies in the fail-to-reject region at all three levels (1%, 5%, 10%), we should presume the data is a random walk, i.e. that a unit root is present. The "z.lag.1" row is the gamma term, the coefficient on the lag term (y(t-1)); its reported p = 0.431 means we fail to reject gamma = 0, simply implying that gamma isn't statistically significant in this model. Here is the output from R:

> summary(ur.df(y=income, type = "none",lags=1))
> 
> ############################################### 
> # Augmented Dickey-Fuller Test Unit Root Test # 
> ############################################### 
> 
> Test regression none 
> 
> 
> Call:
> lm(formula = z.diff ~ z.lag.1 - 1 + z.diff.lag)
> 
> Residuals:
>       Min        1Q    Median        3Q       Max 
> -0.044067 -0.016747 -0.006596  0.010305  0.085688 
> 
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> z.lag.1    0.0004636  0.0005836   0.794    0.431
> z.diff.lag 0.1724315  0.1362615   1.265    0.211
> 
> Residual standard error: 0.0251 on 51 degrees of freedom
> Multiple R-squared:  0.04696,   Adjusted R-squared:  0.009589 
> F-statistic: 1.257 on 2 and 51 DF,  p-value: 0.2933
> 
> 
> Value of test-statistic is: 0.7944 
> 
> Critical values for test statistics: 
>      1pct  5pct 10pct
> tau1 -2.6 -1.95 -1.61

type = "drift" (your specific question above): $\Delta y_t = a_0 + \gamma \, y_{t-1} + e_t$ (formula from Enders p. 208)

(where $a_0$ is "a sub-zero" and refers to the constant, or drift term) Here is where the output interpretation gets trickier. "tau2" is still the $\gamma=0$ null hypothesis. In this case, the first test statistic, -1.4891, is within the region of failing to reject the null (it is above all three tau2 critical values), so we should again presume a unit root, that $\gamma=0$.
The phi1 term (the second test statistic, 1.4462) refers to the second hypothesis, which is a combined null hypothesis of $a_0 = \gamma = 0$. This means that BOTH of the values are tested to be 0 at the same time. If p<0.05, we reject the null, and presume that AT LEAST one of these is false, i.e. one or both of the terms $a_0$ or $\gamma$ are not 0. Failing to reject this null implies that BOTH $a_0$ AND $\gamma = 0$, implying 1) that $\gamma=0$, therefore a unit root is present, AND 2) $a_0=0$, so there is no drift term. Here is the R output:

> summary(ur.df(y=income, type = "drift",lags=1))
> 
> ############################################### 
> # Augmented Dickey-Fuller Test Unit Root Test # 
> ############################################### 
> 
> Test regression drift 
> 
> 
> Call:
> lm(formula = z.diff ~ z.lag.1 + 1 + z.diff.lag)
> 
> Residuals:
>       Min        1Q    Median        3Q       Max 
> -0.041910 -0.016484 -0.006994  0.013651  0.074920 
> 
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)  0.43453    0.28995   1.499    0.140
> z.lag.1     -0.07256    0.04873  -1.489    0.143
> z.diff.lag   0.22028    0.13836   1.592    0.118
> 
> Residual standard error: 0.0248 on 50 degrees of freedom
> Multiple R-squared:  0.07166,   Adjusted R-squared:  0.03452 
> F-statistic:  1.93 on 2 and 50 DF,  p-value: 0.1559
> 
> 
> Value of test-statistic is: -1.4891 1.4462 
> 
> Critical values for test statistics: 
>       1pct  5pct 10pct
> tau2 -3.51 -2.89 -2.58
> phi1  6.70  4.71  3.86

Finally, for the type="trend": $\Delta y_t = a_0 + \gamma \, y_{t-1} + a_{2}t + e_t$ (formula from Enders p. 208)

(where $a_{2}t$ is a time trend term) The hypotheses (from Enders p. 208) are as follows:
tau: $\gamma=0$
phi3: $\gamma = a_2 = 0$
phi2: $a_0 = \gamma = a_2 = 0$
This matches the R output, which reports the test statistics in the order tau3, phi2, phi3: -2.4216, 2.1927, 2.9343. All three fall within the "fail to reject the null" zones (see critical values below). What tau3 implies, as above, is that we fail to reject the null of a unit root, implying a unit root is present. Failing to reject phi3 implies two things: 1) $\gamma = 0$ (unit root) AND 2) there is no time trend term, i.e., $a_2=0$. If we rejected this null, it would imply that one or both of these terms was not 0. Failing to reject phi2 implies 3 things: 1) $\gamma = 0$ AND 2) no time trend term AND 3) no drift term, i.e. that $\gamma =0$, that $a_0 = 0$, and that $a_2 = 0$. Rejecting this null implies that one, two, OR all three of these terms was NOT zero.
Here is the R output

> summary(ur.df(y=income, type = "trend",lags=1))
> 
> ############################################### 
> # Augmented Dickey-Fuller Test Unit Root Test # 
> ############################################### 
> 
> Test regression trend 
> 
> 
> Call:
> lm(formula = z.diff ~ z.lag.1 + 1 + tt + z.diff.lag)
> 
> Residuals:
>       Min        1Q    Median        3Q       Max 
> -0.036693 -0.016457 -0.000435  0.014344  0.074299 
> 
> Coefficients:
>               Estimate Std. Error t value Pr(>|t|)  
> (Intercept)  1.0369478  0.4272693   2.427   0.0190 *
> z.lag.1     -0.1767666  0.0729961  -2.422   0.0192 *
> tt           0.0006299  0.0003348   1.881   0.0659 .
> z.diff.lag   0.2557788  0.1362896   1.877   0.0665 .
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> Residual standard error: 0.02419 on 49 degrees of freedom
> Multiple R-squared:  0.1342,    Adjusted R-squared:  0.08117 
> F-statistic: 2.531 on 3 and 49 DF,  p-value: 0.06785
> 
> 
> Value of test-statistic is: -2.4216 2.1927 2.9343 
> 
> Critical values for test statistics: 
>       1pct  5pct 10pct
> tau3 -4.04 -3.45 -3.15
> phi2  6.50  4.88  4.16
> phi3  8.73  6.49  5.47
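The comparison of test statistics with critical values for the trend case can be written out as a small base R sketch. The numbers are copied from the output above; the vector names and the choice of the 5% level are assumptions made for illustration. The one thing to keep straight is the direction: tau rejects in the lower tail, the phi statistics reject in the upper tail.

```r
# Test statistics and 5% critical values copied from the ur.df output above
stats <- c(tau3 = -2.4216, phi2 = 2.1927, phi3 = 2.9343)
crit5 <- c(tau3 = -3.45,   phi2 = 4.88,   phi3 = 6.49)

# tau is a left-tailed test; the phi statistics are right-tailed F-type tests
lower_tail <- c(tau3 = TRUE, phi2 = FALSE, phi3 = FALSE)

# Reject when the statistic is more extreme than the critical value
reject <- ifelse(lower_tail, stats < crit5, stats > crit5)
print(reject)
```

All three entries come out FALSE, which is the "fail to reject" result described in the text.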

In your specific example above, for the d.Aus data, since both of the test statistics are inside of the "fail to reject" zone, it implies that $\gamma=0$ AND $a_0 = 0$, meaning that there is a unit root, but no drift term.

Unit Root – Best Practices for ADF/KPSS Unit Root Testing Sequence

The steps where the null hypothesis is rejected relate to the following processes:

Step 1.1 corresponds to (iii), a series that is stationary around a linear trend,
Step 2.1 corresponds to (ii), a series that is stationary around a non-zero mean,
Step 3.1 corresponds to (i), a series that is stationary around a zero mean.

When the null is not rejected, then you may consider the processes (iv) a unit root with a zero drift, (v) a unit root with a non-zero drift or even a unit root with a linear trend. Be aware that the effect of an intercept or a linear trend in a random walk is not the same as in a stationary series. See this post for a graphical illustration of unit root processes with zero intercept (no drift), drift and trend.

If the null of a unit root is rejected, then the $t$-statistic for $\mu=0$ would follow the standard distribution and you could test that $\mu=0$ under a Gaussian or Student-$t$ distribution of the test statistic. Nevertheless, I think what you mention in the last point is a better idea, i.e., using the KPSS test, where the null hypothesis is stationarity. In this way, by combining the ADF and KPSS tests, we may arrive at strong conclusions, rejecting either a unit root or stationarity.

In this post I summarize a sequential procedure of both tests and the conclusions that can be obtained in each case. In section 5 of this document we elaborate further on this approach in the context of seasonal time series (where the HEGY test plays the role of the ADF test and the CH test plays the role of the KPSS test).
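The logic of combining the two tests can be sketched as a small decision function in base R. This is only an illustration of the usual reading of the four possible outcomes, not the sequential procedure from the linked post; the function name and the wording of the returned labels are hypothetical.

```r
# Hypothetical helper: combine the outcomes of an ADF test (null: unit root)
# and a KPSS test (null: stationarity) into one verbal conclusion
combine_adf_kpss <- function(adf_rejects, kpss_rejects) {
  if (adf_rejects && !kpss_rejects) {
    "stationary: unit root rejected, stationarity not rejected"
  } else if (!adf_rejects && kpss_rejects) {
    "unit root: stationarity rejected, unit root not rejected"
  } else if (adf_rejects && kpss_rejects) {
    "conflicting: both nulls rejected"
  } else {
    "inconclusive: neither null rejected"
  }
}

# Example: ADF fails to reject, KPSS rejects
print(combine_adf_kpss(adf_rejects = FALSE, kpss_rejects = TRUE))
```

Only the first two outcomes give the strong conclusions mentioned above; the other two suggest the data are not informative enough or that the test specification (for example the deterministic terms) should be reconsidered.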
