Solved – Question about sample autocovariance function

mathematical-statistics, probability, time-series

I'm reading a time series analysis book, and it defines the sample autocovariance function as:

$$\widehat{\gamma}(h) = n^{-1}\displaystyle\sum_{t=1}^{n-h}(x_{t+h}-\bar{x})(x_t-\bar{x})$$

with $\widehat{\gamma}(-h) = \widehat{\gamma}(h)$ for $h = 0, 1, \ldots, n-1$, where $\bar{x}$ is the sample mean.

Can someone explain intuitively why we divide the sum by $n$ and not by $n-h$? The book explains that this is because the formula above then defines a non-negative definite function, so dividing by $n$ is preferred, but that isn't clear to me. Could someone prove this or show an example?

My first instinct would have been to divide by $n-h$. Is the estimator above biased or unbiased for the autocovariance?

Best Answer

$\widehat{\gamma}$ is used to create covariance matrices: given "times" $t_1, t_2, \ldots, t_k$, it estimates that the covariance of the random vector $X_{t_1}, X_{t_2}, \ldots, X_{t_k}$ (obtained from the random field at those times) is the matrix $\left(\widehat{\gamma}(t_i - t_j), 1 \le i, j \le k\right)$. For many problems, such as prediction, it is crucial that all such matrices be nonsingular. As putative covariance matrices they cannot have any negative eigenvalues, so nonsingularity forces all their eigenvalues to be strictly positive; that is, they must be positive-definite.
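To make the construction concrete, here is a minimal sketch in Python with NumPy (the helper names `gamma_hat` and `cov_matrix` are mine, not from any library) of how such a matrix is assembled from a series:

```python
import numpy as np

def gamma_hat(x, h):
    """Sample autocovariance at lag |h|, using the book's 1/n divisor."""
    x = np.asarray(x, dtype=float)
    n, xbar, h = len(x), x.mean(), abs(h)   # gamma_hat(-h) = gamma_hat(h)
    return np.sum((x[h:] - xbar) * (x[:n - h] - xbar)) / n

def cov_matrix(x, times):
    """Putative covariance matrix (gamma_hat(t_i - t_j)) for the given times."""
    return np.array([[gamma_hat(x, ti - tj) for tj in times] for ti in times])
```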

The simplest situation in which the distinction between the two formulas

$$\widehat{\gamma}(h) = n^{-1}\sum_{t=1}^{n-h}(x_{t+h}-\bar{x})(x_t-\bar{x})$$

and

$$\widehat{\gamma}_0(h) = (n-h)^{-1}\sum_{t=1}^{n-h}(x_{t+h}-\bar{x})(x_t-\bar{x})$$

appears is when $x$ has length $2$; say, $x = (0,1)$. For $t_1=t$ and $t_2 = t+1$ it's simple to compute

$$\widehat{\gamma}_0 = \left( \begin{array}{cc} \frac{1}{4} & -\frac{1}{4} \\ -\frac{1}{4} & \frac{1}{4} \end{array} \right),$$

which is singular, whereas

$$\widehat{\gamma} = \left( \begin{array}{cc} \frac{1}{4} & -\frac{1}{8} \\ -\frac{1}{8} & \frac{1}{4} \end{array} \right)$$

has eigenvalues $3/8$ and $1/8$, whence it is positive-definite.
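This computation is easy to check numerically. Below is a small sketch (again assuming NumPy; `acov` is a hypothetical helper with a flag selecting the divisor) that reproduces both matrices and their eigenvalues:

```python
import numpy as np

def acov(x, h, divisor_n=True):
    """Sample autocovariance at lag |h|, divided by n (book) or by n - h."""
    x = np.asarray(x, dtype=float)
    n, xbar, h = len(x), x.mean(), abs(h)
    s = np.sum((x[h:] - xbar) * (x[:n - h] - xbar))
    return s / n if divisor_n else s / (n - h)

x = [0, 1]
for divisor_n in (False, True):
    G = np.array([[acov(x, i - j, divisor_n) for j in (0, 1)] for i in (0, 1)])
    print(G, np.linalg.eigvalsh(G))
# divisor n - h: eigenvalues 0 and 1/2   (singular)
# divisor n:     eigenvalues 1/8 and 3/8 (positive-definite)
```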

A similar phenomenon happens for $x = (0,1,0,1)$: $\widehat{\gamma}$ is positive-definite, but $\widehat{\gamma}_0$, when applied to the times $(t_1, t_2, t_3, t_4) = (1,2,3,4)$, say, degenerates into a matrix of rank $1$ (its entries alternate between $1/4$ and $-1/4$).

(There is a pattern here: problems arise for any $x$ of the form $(a,b,a,b,\ldots,a,b)$.)
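The same numerical check confirms the four-point case; this sketch reuses the hypothetical `acov` helper from above:

```python
import numpy as np

def acov(x, h, divisor_n=True):          # same helper as in the previous sketch
    x = np.asarray(x, dtype=float)
    n, xbar, h = len(x), x.mean(), abs(h)
    s = np.sum((x[h:] - xbar) * (x[:n - h] - xbar))
    return s / n if divisor_n else s / (n - h)

x, times = [0, 1, 0, 1], range(4)
G0 = np.array([[acov(x, i - j, False) for j in times] for i in times])
G  = np.array([[acov(x, i - j, True)  for j in times] for i in times])
print(np.linalg.matrix_rank(G0))   # 1: every row is +/- (1/4, -1/4, 1/4, -1/4)
print(np.linalg.eigvalsh(G) > 0)   # all True: positive-definite
```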

In most applications the series of observations $x_t$ is so long that, for most lags $h$ of interest (which are much smaller than $n$), the difference between $n^{-1}$ and $(n-h)^{-1}$ is of no consequence. So in practice the distinction is no big deal, and in theory the need for positive-definiteness strongly overrides any desire for unbiased estimates. (Neither divisor gives an exactly unbiased estimator anyway, because $\bar{x}$ is itself an estimate of the mean; dividing by $n-h$ would be unbiased only if the true mean were known and used in place of $\bar{x}$.)
