Forecasting: Principles and Practice (3rd edition) by Hyndman and Athanasopoulos states in Section 2.8, "Autocorrelation", that for data with a trend the autocorrelation function has positive values that slowly decrease as the lag increases.
The autocorrelation function at lag $k$ is defined here as:
$r_k=\frac{\sum_{t=k+1}^{T}{(y_t-\overline y)(y_{t-k}-\overline y)}}{\sum_{t=1}^T (y_t-\overline y)^2}$, where $T$ is the length of the time-series.
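To make the definition concrete, here is a minimal sketch that evaluates $r_k$ exactly as written above, in plain Python. The function name `sample_acf` and the noise-free linear series `trend` are assumptions chosen for illustration; they do not come from the book.

```python
def sample_acf(y, max_lag):
    """Return [r_0, ..., r_max_lag] using the global mean and global variance."""
    T = len(y)
    mean = sum(y) / T
    dev = [v - mean for v in y]          # deviations from the global mean
    denom = sum(d * d for d in dev)      # same denominator for every lag
    # Zero-based indexing: t = k+1..T in the formula becomes indices k..T-1.
    return [
        sum(dev[t] * dev[t - k] for t in range(k, T)) / denom
        for k in range(max_lag + 1)
    ]


# Hypothetical example: an exactly linear trend of length T = 100.
trend = [float(t) for t in range(1, 101)]
r = sample_acf(trend, 40)
print([round(v, 3) for v in r[:6]])
```

For this series the values start at 1 and fall steadily with the lag, which is the behaviour the book describes.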
Can anyone give a rigorous proof, and some intuition, for why this is so regardless of the type of trend the series has?
P.S.: I am new to time-series; however, comfortable with maths.
Best Answer
I had the same question, since I was introduced to time series analysis through the same book. Defining the autocorrelation coefficient this way is useful, but the book itself could do with some clarification on this point.
So yes, as you said, the autocorrelation function decreases along the lag axis. The global sample variance in the denominator stays the same for every lag, whereas, for a trended series, the actual sample variances of the two sub-series that enter the numerator (the series without its first $k$ points and the series without its last $k$ points) are smaller than the variance of the full series, because a larger and larger chunk of the variability at the extreme points of the series is left out.
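For one special case the decrease can be written down exactly. The following is only a sketch for an exactly linear, noise-free trend $y_t = a + bt$ with $b \neq 0$ (an assumption made for illustration), not a proof for arbitrary trends. Since $r_k$ does not change when the series is shifted or rescaled, take $y_t = t$, so that $\overline y = (T+1)/2$. Evaluating the two sums in the definition gives

$$\sum_{t=1}^{T}(y_t-\overline y)^2=\frac{T(T^2-1)}{12},\qquad \sum_{t=k+1}^{T}(y_t-\overline y)(y_{t-k}-\overline y)=\frac{(T-k)(T^2-2kT-2k^2-1)}{12},$$

and therefore

$$r_k=\frac{(T-k)(T^2-2kT-2k^2-1)}{T(T^2-1)}\approx 1-\frac{3k}{T}\quad\text{for } k\ll T.$$

In particular $r_1=(T-3)/T$. In this case $r_k$ starts close to 1, falls roughly linearly in $k$, and stays positive while $T^2-2kT-2k^2-1>0$, that is, up to roughly $k\approx 0.366\,T$.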
My idea is that this is mainly useful for distinguishing a trended series from one that hovers around a constant value: without this artifact of the calculation we would not be able to tell the two apart in the ACF, since both would have a roughly constant ACF of about 1. That does not happen when the global sample variance is used instead of the actual sample variances of the lagged, truncated series.
Obviously, a non-linear trend induces a decreasing ACF either way.
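The difference between the two formulations can be checked numerically. The sketch below is illustrative only: the helper names `global_acf` and `pearson_lagged` and the two noise-free test series are assumptions, not anything taken from the book. It compares $r_k$ (global mean and variance) with the ordinary Pearson correlation between the two truncated sub-series, which uses their own means and variances.

```python
def global_acf(y, k):
    """r_k as defined in the question: global mean, global variance."""
    T = len(y)
    mean = sum(y) / T
    dev = [v - mean for v in y]
    return sum(dev[t] * dev[t - k] for t in range(k, T)) / sum(d * d for d in dev)


def pearson_lagged(y, k):
    """Pearson correlation of (y_{k+1..T}, y_{1..T-k}) with their own moments."""
    a, b = y[k:], y[:len(y) - k]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical noise-free series of length 100.
linear = [float(t) for t in range(1, 101)]
quadratic = [float(t * t) for t in range(1, 101)]

for k in (1, 10, 30):
    print(
        k,
        round(global_acf(linear, k), 4),         # decreases with k
        round(pearson_lagged(linear, k), 4),     # stays at 1 for a linear trend
        round(pearson_lagged(quadratic, k), 4),  # drops below 1 as k grows
    )
```

For the linear trend the Pearson version stays at 1 for every lag while $r_k$ decays; for the quadratic trend the Pearson version also falls below 1 as the lag grows.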