Is every correlation matrix positive definite?

correlation matrix, covariance-matrix, eigenvalues

I'm talking here about matrices of Pearson correlations.

I've often heard it said that all correlation matrices must be positive semidefinite. My understanding is that positive definite matrices must have eigenvalues $> 0$, while positive semidefinite matrices must have eigenvalues $\ge 0$. This makes me think that my question can be rephrased as "Is it possible for correlation matrices to have an eigenvalue $= 0$?"

Is it possible for a correlation matrix (generated from empirical data, with no missing data) to have an eigenvalue $= 0$, or an eigenvalue $< 0$? What if it was a population correlation matrix instead?

I read in the top answer to this question about covariance matrices that:

Consider three variables, $X$, $Y$ and $Z = X+Y$. Their covariance matrix, $M$, is not positive definite, since there's a vector $z$ ($= (1, 1, -1)'$) for which $z'Mz$ is not positive.

However, if I do the same calculation on a correlation matrix instead of a covariance matrix, $z'Mz$ comes out positive. This makes me think that the situation may be different for correlation and covariance matrices.

My reason for asking is that I was asked about this over on Stack Overflow, in relation to a question I posted there.

Best Answer

Correlation matrices need not be positive definite.

Consider a scalar random variable $X$ having non-zero variance. Then the correlation matrix of $X$ with itself is the $2 \times 2$ matrix of all ones, which is positive semi-definite, but not positive definite.
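To spell that out as a worked check (the vectors $z = (1, -1)'$ and $v = (v_1, v_2)'$ are my own choice of notation for illustration, not part of the original argument):

$$
R = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad
v'Rv = v_1^2 + 2 v_1 v_2 + v_2^2 = (v_1 + v_2)^2 \ge 0, \qquad
z'Rz = (1 - 1)^2 = 0 .
$$

So the quadratic form is never negative, but it vanishes for a non-zero vector. Equivalently, the eigenvalues of $R$ are $0$ and $2$.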

As for sample correlation, consider sample data for the above consisting of two observations, $(1, 1)$ and $(2, 2)$. The resulting sample correlation matrix is the matrix of all ones, so it is not positive definite.
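A minimal sketch of that sample calculation, assuming NumPy is available (the variable names are mine, chosen for illustration):

```python
import numpy as np

# Rows are observations, columns are the two variables (X and a copy of X).
data = np.array([[1.0, 1.0],
                 [2.0, 2.0]])

# rowvar=False tells NumPy that each column is a variable.
R = np.corrcoef(data, rowvar=False)

# eigvalsh is meant for symmetric matrices; eigenvalues come back in ascending order.
eigenvalues = np.linalg.eigvalsh(R)

print(R)            # every entry is 1, up to floating-point roundoff
print(eigenvalues)  # approximately 0 and 2
```

The smallest eigenvalue is zero up to roundoff, so the matrix is positive semi-definite but singular.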

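A sample correlation matrix, if computed in exact arithmetic (i.e., with no roundoff error), cannot have negative eigenvalues. The reason is that it can be written as $R = \frac{1}{n-1} A'A$, where $A$ is the data matrix with each column centered and scaled to unit sample standard deviation, so $v'Rv = \frac{1}{n-1}\|Av\|^2 \ge 0$ for every vector $v$.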
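The same point can be checked on the question's example of $X$, $Y$ and $Z = X+Y$. The sketch below is only an illustration under my own assumptions (simulated normal data, an arbitrary seed, NumPy available). It suggests why $z = (1, 1, -1)'$ gives a positive value on the correlation matrix even though that matrix is still singular: after rescaling to correlations, the vector that annihilates it is $(s_X, s_Y, -s_Z)'$, built from the sample standard deviations.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
n = 1000
x = rng.normal(size=n)
y = 2.0 * rng.normal(size=n)     # deliberately on a different scale from x
z = x + y                        # exact linear combination of x and y

data = np.column_stack([x, y, z])
R = np.corrcoef(data, rowvar=False)

# Ascending eigenvalues; the smallest is zero up to roundoff and may
# print as a tiny negative number in floating point.
eigenvalues = np.linalg.eigvalsh(R)

# The vector from the covariance argument does NOT annihilate R ...
v = np.array([1.0, 1.0, -1.0])
q_v = v @ R @ v

# ... but the same vector rescaled by the sample standard deviations does.
s = data.std(axis=0, ddof=1)
w = np.array([s[0], s[1], -s[2]])
q_w = w @ R @ w

print(eigenvalues)
print(q_v)   # clearly positive
print(q_w)   # zero up to roundoff
```

So the correlation matrix here is positive semi-definite with an eigenvalue of zero, just like the covariance matrix. Only the direction of the null vector changes.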