Why does a covariance matrix have to be positive semi-definite, on an intuitive level?

Tags: covariance, probability, statistics

If I have a random vector $X$ with expectation $\mu \in \mathbb{R}^n$, it is not that difficult to show that its covariance matrix is positive semi-definite: given
$$
\operatorname{Cov}(X) = \mathbb{E}[(X-\mu) (X-\mu)^T]
$$

For any vector $z\in \mathbb{R}^n$ we have
$$
z^T \operatorname{Cov}(X) z = z^T \mathbb{E}[(X-\mu) (X-\mu)^T] z = \mathbb{E}[z^T(X-\mu) (X-\mu)^T z]
$$

which is just the expectation of a squared inner product,
$$
\mathbb{E}[z^T(X-\mu) (X-\mu)^T z] = \mathbb{E}[\langle z, X-\mu\rangle^2] \geq 0,
$$

and hence always greater than or equal to $0$.
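As a quick numerical sanity check (a sketch using NumPy with an arbitrary, made-up distribution), the quadratic form $z^T \operatorname{Cov}(X)\, z$ is exactly the sample variance of the scalar $z^T X$, so it comes out non-negative for every direction $z$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Samples of a random vector X in R^3 with correlated entries
# (an arbitrary linear mix of independent Gaussians).
X = rng.standard_normal((10_000, 3)) @ rng.standard_normal((3, 3))

cov = np.cov(X, rowvar=False)  # 3x3 sample covariance matrix

# z^T cov z equals the sample variance of the projection z^T X,
# so it is non-negative for any z.
for _ in range(100):
    z = rng.standard_normal(3)
    assert z @ cov @ z >= 0
```

The same check via eigenvalues: all eigenvalues of `cov` are non-negative, which is an equivalent statement of positive semi-definiteness for a symmetric matrix.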

What I don't really understand, however, is why a non-PSD matrix cannot function as a covariance matrix, on an intuitive level.
Suppose you have a non-PSD matrix: can you prove by contradiction that it cannot be the covariance matrix of some random vector $X$?

Best Answer

Basically, I use the same argument as you, from a slightly different perspective. Suppose that $\Sigma=\operatorname E[(X-\operatorname EX)(X-\operatorname EX)']$ is the covariance matrix of a random vector $X$ such that $\operatorname E\|X\|^2<\infty$ (this is just to ensure that the covariance matrix is well-defined). Now suppose that $\Sigma$ is not positive semi-definite. This means that there exists some vector $a$ such that
$$
a'\Sigma a<0.
$$
It follows that
\begin{align*}
a'\Sigma a &=a'\operatorname E[(X-\operatorname EX)(X-\operatorname EX)']a\\
&=\operatorname E[a'(X-\operatorname EX)(X-\operatorname EX)'a]\\
&=\operatorname{Var}(a'X)\\
&<0.
\end{align*}
Observe that
$$
a'X = a_1X_1+\ldots+a_dX_d.
$$
This means that there exists some linear combination of the entries of $X$ whose variance is negative, which of course does not make sense.
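To make the contradiction concrete, here is a small numerical sketch (the matrix below is a made-up example, not from the post). A symmetric matrix that is not PSD has a negative eigenvalue, and the corresponding eigenvector is precisely a direction $a$ with $a'\Sigma a < 0$, i.e. a linear combination $a'X$ whose "variance" would be negative:

```python
import numpy as np

# A symmetric matrix that is NOT positive semi-definite
# (hypothetical example; its eigenvalues are -1 and 3).
sigma = np.array([[1.0, 2.0],
                  [2.0, 1.0]])

eigvals, eigvecs = np.linalg.eigh(sigma)  # ascending eigenvalues
assert eigvals[0] < 0  # confirms sigma is not PSD

# The unit eigenvector for the negative eigenvalue gives a' sigma a = -1:
# if sigma were Cov(X), then Var(a'X) would be negative -- impossible.
a = eigvecs[:, 0]
print(a @ sigma @ a)  # approximately -1
```

So any candidate covariance matrix with a negative eigenvalue immediately yields a linear combination of the coordinates with negative variance, which is the contradiction.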

I hope this is useful.