I'm talking here about matrices of Pearson correlations.
I've often heard it said that all correlation matrices must be positive semidefinite. My understanding is that positive definite matrices must have eigenvalues $> 0$, while positive semidefinite matrices must have eigenvalues $\ge 0$. This makes me think that my question can be rephrased as "Is it possible for correlation matrices to have an eigenvalue $= 0$?"
Is it possible for a correlation matrix (generated from empirical data, with no missing data) to have an eigenvalue $= 0$, or an eigenvalue $< 0$? What if it was a population correlation matrix instead?
I read in the top answer to this question about covariance matrices that:
"Consider three variables, $X$, $Y$ and $Z = X+Y$. Their covariance matrix, $M$, is not positive definite, since there's a vector $z$ ($= (1, 1, -1)'$) for which $z'Mz$ is not positive."
However, if I do those calculations on a correlation matrix instead of a covariance matrix, then $z'Mz$ comes out positive. This makes me think the situation may differ between correlation and covariance matrices.
My reason for asking is that I was asked about this on Stack Overflow, in relation to a question I posted there.
Best Answer
Correlation matrices need not be positive definite.
Consider a scalar random variable $X$ with non-zero variance. The correlation matrix of $X$ with itself is the $2 \times 2$ matrix of all ones, whose eigenvalues are $2$ and $0$. It is therefore positive semidefinite, but not positive definite.
As for sample correlation, consider sample data for the above with two observations: $(1, 1)$ and $(2, 2)$. The sample correlation matrix is again the matrix of all ones, so it is not positive definite.
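The two-observation example can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the variable names `data`, `R` and `eigenvalues` are chosen here for illustration only.

```python
import numpy as np

# Two variables that are copies of each other:
# observation 1 is (1, 1), observation 2 is (2, 2).
data = np.array([[1.0, 1.0],
                 [2.0, 2.0]])

# rowvar=False: each column is a variable, each row an observation.
R = np.corrcoef(data, rowvar=False)

# eigvalsh is meant for symmetric matrices and returns eigenvalues
# in ascending order.
eigenvalues = np.linalg.eigvalsh(R)

print(R)            # all entries equal 1, up to floating-point roundoff
print(eigenvalues)  # approximately 0 and 2
```

The smallest eigenvalue is zero up to roundoff, so the matrix is singular: positive semidefinite, but not positive definite.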
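A sample correlation matrix, if computed in exact arithmetic (i.e., with no roundoff error), cannot have negative eigenvalues. It can be written as $R = A'A/(n-1)$, where the columns of $A$ are the standardized variables and $n$ is the number of observations, so $v'Rv = \|Av\|^2/(n-1) \ge 0$ for every vector $v$. The same holds for a population correlation matrix, since there $v'Rv$ is the variance of a linear combination of the standardized variables. In floating-point arithmetic, an eigenvalue that should be exactly zero may come out as a tiny negative number.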
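The $Z = X + Y$ example from the question can be explored the same way. The sketch below uses simulated data (the seed, sample size and scales are arbitrary assumptions, not taken from the question) and suggests why $z = (1, 1, -1)'$ gives a positive value for the correlation matrix even though that matrix is still singular: the linear dependence among the standardized variables has weights proportional to the standard deviations, $(s_X, s_Y, -s_Z)'$, rather than $(1, 1, -1)'$.

```python
import numpy as np

rng = np.random.default_rng(0)    # arbitrary seed, for reproducibility only
n = 200                           # arbitrary sample size
x = rng.normal(size=n)
y = rng.normal(scale=3.0, size=n) # different scale from x, on purpose
z = x + y                         # exact linear dependence
data = np.column_stack([x, y, z])

R = np.corrcoef(data, rowvar=False)
eigenvalues = np.linalg.eigvalsh(R)  # ascending order

# The vector from the quoted covariance example.
v = np.array([1.0, 1.0, -1.0])
quad_v = v @ R @ v                # strictly positive for this data

# Null vector of the correlation matrix: weights are the sample
# standard deviations, because s_x*x_std + s_y*y_std - s_z*z_std = 0.
s = data.std(axis=0, ddof=1)
w = np.array([s[0], s[1], -s[2]])
quad_w = w @ R @ w                # zero up to roundoff

print(eigenvalues[0])  # near zero; roundoff may make it slightly negative
print(quad_v, quad_w)
```

So the correlation matrix here is also positive semidefinite without being positive definite; only the direction of the zero eigenvalue's eigenvector changes when moving from covariances to correlations.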