Solved – Detect trend in time series


Hypothesis: the time series has an inverted-U shape.

How do we test this numerically?

My idea is to take the first difference of the variable and fit a linear
model using the differenced variable as the endogenous variable and the time
variable as the exogenous variable:

$\Delta y_t = \beta_1 + \beta_2 t + \epsilon_t$

If the hypothesis is true, $\beta_2$ should be significantly less than zero.
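
In R, the test could be sketched like this (a minimal sketch, assuming `y` holds the observed series; the full data-generating code is in the Edit below):

# regress the first differences of y on time and look at the sign of the slope
dy <- diff(y)            # first differences, length n - 1
dt <- seq_along(dy) + 1  # matching time index t = 2, ..., n
summary(lm(dy ~ dt))     # inverted U  =>  coefficient on dt < 0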

If we try this approach on noise-free, computer-generated data, it works
well:

[Plot: the noise-free series y and its first difference]

Call:
lm(formula = dy ~ dt)

Residuals:
       Min         1Q     Median         3Q        Max 
-1.219e-15 -2.520e-16 -2.218e-17  1.827e-16  1.241e-15 

Coefficients:
              Estimate Std. Error    t value Pr(>|t|)    
(Intercept)  1.210e-01  1.118e-16  1.082e+15   <2e-16 ***
dt          -2.000e-03  2.245e-18 -8.910e+14   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 4.988e-16 on 82 degrees of freedom
Multiple R-squared:     1,  Adjusted R-squared:     1 
F-statistic: 7.939e+29 on 1 and 82 DF,  p-value: < 2.2e-16 

However, if a slight amount of noise is added to the data, this method falls
apart catastrophically:

[Plot: the noisy series y and its first difference]

Call:
lm(formula = dy ~ dt)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.96480 -0.21802  0.00826  0.24701  0.93200 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.114305   0.087548   1.306    0.195
dt          -0.001922   0.001758  -1.093    0.277

Residual standard error: 0.3907 on 82 degrees of freedom
Multiple R-squared: 0.01437,    Adjusted R-squared: 0.002345 
F-statistic: 1.195 on 1 and 82 DF,  p-value: 0.2775 

So, what is the alternative?

Edit

R code to generate the series and the plots:

# time index and inverted-U (quadratic) trend plus Gaussian noise;
# drop the rnorm() term to reproduce the noise-free example above
t <- 1:85
y <- 0.12 * t - 0.001 * t^2 + rnorm(length(t), sd=0.25)

# first differences of y, aligned with the time points t = 2, ..., 85
dt <- tail(t, -1)
dy <- tail(y, -1) - head(y, -1)

plot(t, y, ylim=c(-0.5, 4), pch=19, col='navy')
points(dt, dy, pch=19, col='purple')
legend(x=3, y=3.5, c('y','first difference'), pch=19, col=c('navy','purple'))

# regress the differences on time
summary(lm(dy ~ dt))

Best Answer

If you use lm, then you should check the residuals to see whether they are autocorrelated. I suspect they are, and hence your t-tests are not valid (this is true also for summary(lm(y ~ t + I(t^2)))). This is basically because there is a time variable involved in your lm.
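
For instance, a quick check could look like this (a sketch; `fit` is just a placeholder name for the quadratic fit discussed above):

fit <- lm(y ~ t + I(t^2))
acf(resid(fit))    # spikes outside the confidence bands suggest autocorrelation
pacf(resid(fit))   # the PACF helps suggest an AR order p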

I recommend a generalized least squares (GLS) approach to test the quadratic effect while taking the autocorrelation into account. For example, if you assume an autoregressive process of order two for the residuals of your lm (i.e. $e_t=\phi_1 e_{t-1}+\phi_2 e_{t-2}+\nu_t$, where $\nu_t$ is white noise), then the code would look like

library(nlme)
# quadratic trend with AR(2) errors; corARMA(p = 2) models the
# residual autocorrelation e_t = phi_1 e_{t-1} + phi_2 e_{t-2} + nu_t
m1 <- gls(y ~ t + I(t^2), correlation = corARMA(p = 2))
summary(m1)

Note: You should model the error terms correctly first (i.e. find the orders $p$ and $q$), perhaps by checking the ACF and PACF of the residuals of your lm. Above, I assumed AR(2). More complicated ARMA models can be considered and tested.
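
For example, candidate error structures could be compared by refitting with maximum likelihood (a sketch; method = "ML" makes the likelihoods, and hence AIC and the likelihood-ratio test, comparable across models):

m_ar1 <- gls(y ~ t + I(t^2), correlation = corARMA(p = 1), method = "ML")
m_ar2 <- gls(y ~ t + I(t^2), correlation = corARMA(p = 2), method = "ML")
anova(m_ar1, m_ar2)   # likelihood-ratio test / AIC comparison of the two fits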