Hypothesis: time series has an inverted-U shape.
How do we test this numerically?
My idea is to take the first difference of the series and fit a linear
model with the differenced variable as the endogenous variable and time as
the exogenous variable:
$\Delta y_t = \beta_1 + \beta_2 t + \epsilon_t$
If the hypothesis is true, $\beta_2$ should be significantly less than zero.
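To spell out the reasoning (using $a$, $b$, $c$ as illustrative names for the trend coefficients): if the series follows a quadratic trend $y_t = a + b t + c t^2 + u_t$ with $c < 0$ (an inverted U), then the first difference is

$\Delta y_t = y_t - y_{t-1} = b + c\,\bigl(t^2 - (t-1)^2\bigr) + \Delta u_t = (b - c) + 2c\,t + \Delta u_t,$

so the slope in the differenced regression is $\beta_2 = 2c$, which is negative exactly when the trend is concave.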
If we try this approach on computer-generated data, it appears to work well:
Call:
lm(formula = dy ~ dt)

Residuals:
       Min         1Q     Median         3Q        Max
-1.219e-15 -2.520e-16 -2.218e-17  1.827e-16  1.241e-15

Coefficients:
              Estimate Std. Error    t value Pr(>|t|)
(Intercept)  1.210e-01  1.118e-16  1.082e+15   <2e-16 ***
dt          -2.000e-03  2.245e-18 -8.910e+14   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.988e-16 on 82 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: 1
F-statistic: 7.939e+29 on 1 and 82 DF, p-value: < 2.2e-16
However, if a slight amount of noise is added to the data, this method falls
apart catastrophically:
Call:
lm(formula = dy ~ dt)

Residuals:
     Min       1Q   Median       3Q      Max
-0.96480 -0.21802  0.00826  0.24701  0.93200

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.114305   0.087548   1.306    0.195
dt          -0.001922   0.001758  -1.093    0.277

Residual standard error: 0.3907 on 82 degrees of freedom
Multiple R-squared: 0.01437, Adjusted R-squared: 0.002345
F-statistic: 1.195 on 1 and 82 DF, p-value: 0.2775
So, what is the alternative?
Edit: R code to generate the series and the plots:
t <- 1:85
y <- 0.12 * t - 0.001 * t^2 + rnorm(length(t), sd = 0.25)  # quadratic trend plus noise
dt <- tail(t, -1)                # time index for the differenced series
dy <- tail(y, -1) - head(y, -1)  # first difference of y
plot(t, y, ylim = c(-0.5, 4), pch = 19, col = 'navy')
points(dt, dy, pch = 19, col = 'purple')
legend(x = 3, y = 3.5, c('y', 'first difference'), pch = 19, col = c('navy', 'purple'))
summary(lm(dy ~ dt))
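For reference, the noise-free case (a sketch: the same code with the rnorm term dropped) reproduces the first, essentially perfect fit above. The difference of the pure quadratic is exactly $\Delta y_t = 0.121 - 0.002\,t$, so the regression recovers intercept $0.121$ and slope $-0.002$ up to floating-point error:

```r
# Noise-free version of the experiment: the slope 2c = -0.002
# is recovered exactly (up to floating-point error).
t <- 1:85
y <- 0.12 * t - 0.001 * t^2          # pure quadratic trend, no noise
dt <- tail(t, -1)
dy <- tail(y, -1) - head(y, -1)      # equals 0.121 - 0.002 * dt exactly
fit <- lm(dy ~ dt)
coef(fit)                            # intercept ~ 0.121, slope ~ -0.002
```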
Best Answer
If you use lm then you should check whether the residuals are autocorrelated. I suspect they are, and hence your t-tests are not valid (this is also true for summary(lm(y ~ t + I(t^2)))). This is basically because there is a time variable involved in your lm.
I recommend using a generalized least squares (GLS) approach to test the quadratic effect while accounting for the autocorrelation problem. For example, if you assume autoregressive errors of order two (see below) for the residuals of your lm (i.e. $e_t=\phi_1 e_{t-1}+\phi_2 e_{t-2}+\nu_t$, where $\nu_t$ is white noise), then the code would be along these lines.
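A minimal sketch of such a fit using nlme::gls (the specific gls call with a corARMA correlation structure is my illustration; the answer only specifies AR(2) errors, and the data are regenerated as in the question):

```r
library(nlme)  # provides gls() and corARMA()

# Regenerate the questioner's noisy quadratic series.
set.seed(1)
t <- 1:85
y <- 0.12 * t - 0.001 * t^2 + rnorm(length(t), sd = 0.25)

# GLS fit of the quadratic trend with AR(2) errors:
#   e_t = phi1 * e_{t-1} + phi2 * e_{t-2} + nu_t
fit <- gls(y ~ t + I(t^2), correlation = corARMA(p = 2, q = 0),
           data = data.frame(y, t))
summary(fit)  # check the sign and significance of the I(t^2) term
```

With a genuinely concave trend, the I(t^2) coefficient should come out significantly negative, which is the direct test of the inverted-U hypothesis.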
Note: you should model the error terms correctly first (i.e. find the orders $p$ and $q$), for example by checking the ACF and PACF of the residuals of your lm. Above, I assumed AR(2); more complicated ARMA models can be considered and tested.
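The residual diagnostics described above can be sketched as follows (regenerating the question's data; the choice of lm here mirrors the quadratic fit mentioned in the answer):

```r
# Diagnose the error structure of the OLS fit before choosing p and q.
t <- 1:85
y <- 0.12 * t - 0.001 * t^2 + rnorm(length(t), sd = 0.25)
r <- residuals(lm(y ~ t + I(t^2)))
acf(r)   # autocorrelation function: suggests the MA order q
pacf(r)  # partial autocorrelation function: suggests the AR order p
```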