Autocorrelation – Understanding Autocorrelation in Time-Series Data

Tags: autocorrelation, time series

To preface this, I have a pretty deep mathematical background, but I've never really dealt with time series or statistical modeling. So you don't have to be very gentle with me 🙂

I'm reading this paper about modeling energy use in commercial buildings, and the author makes this claim:

[The presence of autocorrelation arises] because the model has been developed from time series data of energy use, which is inherently autocorrelated. Any purely deterministic model for time series data will have autocorrelation. Autocorrelation is found to reduce if [more Fourier coefficients] are included in the model. However, in most of the cases the Fourier model has low C.V. The model may, therefore, be acceptable for practical purposes that does (sic) not demand high precision.

0.) What does "any purely deterministic model for time series data will have autocorrelation" mean? I can vaguely understand what this means: for example, how would you expect to predict the next point in your time series if you had zero autocorrelation? That isn't a mathematical argument, to be sure, which is why this is question 0 🙂

1.) I was under the impression that autocorrelation basically killed your model, but thinking about it, I can't understand why this should be the case. So why is autocorrelation a bad (or good) thing?

2.) The solution I've heard for dealing with autocorrelation is to difference the time series. Without trying to read the author's mind, why would one not difference the series if non-negligible autocorrelation exists?

3.) What limitations do non-negligible autocorrelations place on a model? Is this an assumption somewhere (e.g., like the assumption of normally distributed residuals in simple linear regression)?

Anyway, sorry if these are basic questions, and thanks in advance for helping.

Best Answer

  1. I think the author is probably talking about the residuals of the model. I argue this because of his statement about adding more Fourier coefficients: if, as I believe, he is fitting a Fourier model, then adding more coefficients will reduce the autocorrelation of the residuals at the expense of a higher CV.

    If you have trouble visualizing this, think of the following example: suppose you have a 100-point data set that comes from a two-coefficient Fourier model with added white Gaussian noise:

    [Figure: the simulated data points]

    The following graph shows two fits: one done with 2 Fourier coefficients, and one done with 200 Fourier coefficients:

    [Figure: the 2-coefficient and 200-coefficient Fourier fits]

    As you can see, the 200-coefficient fit follows the DATA POINTS better, while the 2-coefficient fit (the 'true' model) follows the underlying MODEL better. This implies that the autocorrelation of the residuals of the 200-coefficient model will almost surely be closer to zero at all lags than that of the 2-coefficient model, because the 200-coefficient model fits almost every data point exactly (i.e., its residuals are almost all zero). However, what do you think would happen if you left, say, 10 data points out of the sample and fit the same models? The 2-coefficient model would predict the left-out data points better! Thus, it would produce a lower CV error than the 200-coefficient model; what the 200-coefficient model is doing is called overfitting. The reason behind this 'magic' is that what CV actually tries to measure is prediction error, i.e., how well your model predicts data points that are not in your data set. (There is a small numerical sketch of this point at the end of this answer.)

  2. In this context, autocorrelation in the residuals is 'bad', because it means you are not modeling the correlation between data points well enough. The main reason people don't difference the series is that they actually want to model the underlying process as it is. One usually differences a time series to get rid of periodicities or trends, but if that periodicity or trend is exactly what you are trying to model, then differencing it away is more of a last-resort option (or a step toward modeling the residuals with a more complex stochastic process). The second sketch at the end of this answer shows a toy example of what differencing does.
  3. This really depends on the area you are working in. The autocorrelation could also point to a problem with the deterministic model itself. However, depending on its shape, you can often tell whether the residual autocorrelation arises from, e.g., flicker noise, ARMA-like noise, or a leftover periodic component (in which case you might want to increase the number of Fourier coefficients). The second sketch below also illustrates how the residual ACF can hint at which of these you are dealing with.
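
Here is a small numerical sketch of the point in (1), in Python with NumPy. This is not the author's model or data; the series, coefficient counts, seed, and hold-out size are all made-up assumptions. A fit with too few Fourier terms leaves visibly autocorrelated residuals, the 'true' two-term fit leaves roughly white residuals and predicts held-out points well, and a heavily over-parameterized fit drives the in-sample residuals (and hence their autocorrelation) toward zero but typically predicts the held-out points worse:

    import numpy as np

    rng = np.random.default_rng(0)

    n = 100
    t = np.arange(n)

    # "True" process: a two-coefficient Fourier model plus white Gaussian noise
    # (frequencies 1 and 2 over the 100-point window; all values are made up).
    y = (3.0 * np.sin(2 * np.pi * t / n)
         + 1.5 * np.cos(2 * np.pi * 2 * t / n)
         + rng.normal(0.0, 1.0, n))

    def fourier_design(t_idx, k):
        """Design matrix: intercept plus k sine/cosine pairs with base period n."""
        cols = [np.ones(len(t_idx))]
        for j in range(1, k + 1):
            cols.append(np.sin(2 * np.pi * j * t_idx / n))
            cols.append(np.cos(2 * np.pi * j * t_idx / n))
        return np.column_stack(cols)

    def lag1_autocorr(r):
        """Sample lag-1 autocorrelation of the residuals r."""
        r = r - r.mean()
        return np.dot(r[:-1], r[1:]) / np.dot(r, r)

    # Leave the last 10 points out of the sample, fit on the first 90.
    t_fit, t_out = t[:90], t[90:]
    y_fit, y_out = y[:90], y[90:]

    for k in (1, 2, 40):   # too few terms, the 'true' size, and a gross overfit
        beta, *_ = np.linalg.lstsq(fourier_design(t_fit, k), y_fit, rcond=None)
        resid = y_fit - fourier_design(t_fit, k) @ beta
        pred = fourier_design(t_out, k) @ beta
        print(f"k={k:>2}  lag-1 residual autocorr = {lag1_autocorr(resid):+.2f}  "
              f"hold-out MSE = {np.mean((y_out - pred) ** 2):.2f}")

The exact numbers depend on the noise realization, but the qualitative pattern (residual autocorrelation shrinking while hold-out error grows as you add terms) is the overfitting trade-off described in (1).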
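And a second sketch for points (2) and (3), again with everything (series, coefficients, seed) made up purely for illustration: a random walk with drift shows why differencing removes autocorrelation but changes what you are modeling, and two synthetic residual series show how the shape of the sample ACF hints at whether the leftover structure is a missed periodic component or ARMA-like noise:

    import numpy as np

    rng = np.random.default_rng(1)

    def sample_acf(x, max_lag):
        """Sample autocorrelation of x at lags 1..max_lag."""
        x = x - x.mean()
        denom = np.dot(x, x)
        return np.array([np.dot(x[:-k], x[k:]) / denom for k in range(1, max_lag + 1)])

    n = 200
    t = np.arange(n)

    # (2) Differencing: a random walk with drift is heavily autocorrelated;
    # its first difference is roughly white noise -- but now you are modeling
    # the changes, not the levels.
    walk = np.cumsum(0.05 + rng.normal(0.0, 1.0, n))
    print("random walk       lag-1 acf:", round(sample_acf(walk, 1)[0], 2))
    print("first difference  lag-1 acf:", round(sample_acf(np.diff(walk), 1)[0], 2))

    # (3) Two kinds of residual autocorrelation an ACF can separate: a leftover
    # periodic component (a missed Fourier term) makes the ACF oscillate like a
    # cosine, while ARMA-like noise (here an AR(1)) decays smoothly toward zero.
    periodic_resid = np.sin(2 * np.pi * t / 25) + rng.normal(0.0, 0.3, n)
    ar1_resid = np.zeros(n)
    for i in range(1, n):
        ar1_resid[i] = 0.7 * ar1_resid[i - 1] + rng.normal(0.0, 1.0)

    print("periodic residuals acf(1..12):", np.round(sample_acf(periodic_resid, 12), 2))
    print("AR(1) residuals    acf(1..12):", np.round(sample_acf(ar1_resid, 12), 2))

If the residual ACF oscillates, you probably want more Fourier terms; if it decays geometrically, a stochastic error model (ARMA-type) for the residuals is the more natural fix.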