Proof clarification: For any r.v.s $X$ and $Y$, $-1 \le \text{Corr}(X, Y) \le 1$.

correlation, covariance, probability, proof-explanation, variance

My textbook, Introduction to Probability by Blitzstein and Hwang, gives the following theorem in a section on covariance and correlation:

Theorem 7.3.5 (Correlation bounds). For any r.v.s $X$ and $Y$,
$$
-1 \le\operatorname{Corr}(X, Y) \le 1.
$$

Proof. Without loss of generality we can assume $X$ and $Y$ have variance $1$, since scaling does not change the correlation. Let $\rho = \operatorname{Corr}(X, Y) = \operatorname{Cov}(X, Y)$. Using the fact that variance is nonnegative, along with property $7$ of covariance, we have
$$
\operatorname{Var}(X + Y)
= \operatorname{Var}(X) +\operatorname{Var}(Y) + 2\operatorname{Cov}(X, Y)
= 2 + 2\rho\ge 0,
$$

$$
\operatorname{Var}(X - Y)
= \operatorname{Var}(X) + \operatorname{Var}(Y) - 2 \operatorname{Cov}(X, Y)
= 2 - 2\rho\ge 0.
$$

Thus, $-1 \le \rho \le 1$.

Property 7 of covariance is given on a previous page as follows:

  7. $\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X, Y)$. For $n$ r.v.s $X_1,\dots, X_n$,
    $$
    \operatorname{Var}(X_1 +\dots + X_n)
    = \operatorname{Var}(X_1) +\dots +\operatorname{Var}(X_n) + 2\sum_{i < j} \operatorname{Cov}(X_i, X_j).
    $$

It's not clear to me why we conclude that $2 + 2\rho\ge 0$ and $2 - 2\rho\ge 0$. I'm thinking that my understanding is missing some knowledge about the connection between variance and the covariance $\rho$, because I can tell that $2 - 2\rho = 2(1 -\rho)\ge 0$ for $-1\le\rho \le 1$ (and analogously for $2 + 2\rho\ge 0$), but, assuming this correctly identifies where my misunderstanding is, it isn't clear to me why $-1\le\rho \le 1$.

I would greatly appreciate it if people could please take the time to clarify this.

Best Answer

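It follows from the fact that variance is always nonnegative, since $$ \operatorname{Var}(X) = \mathbb{E}\left[(X - \mathbb{E}X)^2\right] \ge 0. $$ The quantities $2 + 2\rho$ and $2 - 2\rho$ are themselves variances (of $X + Y$ and $X - Y$), so they are nonnegative without assuming anything about $\rho$.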
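
To spell out the step the question asks about, here is the reasoning written as a short derivation. It assumes, as in the textbook proof, that $X$ and $Y$ have already been rescaled so that $\operatorname{Var}(X) = \operatorname{Var}(Y) = 1$. The bound on $\rho$ is the conclusion of each line, not something used along the way.

$$
\begin{aligned}
0 \le \operatorname{Var}(X + Y) &= 2 + 2\rho &&\Longrightarrow\quad \rho \ge -1, \\
0 \le \operatorname{Var}(X - Y) &= 2 - 2\rho &&\Longrightarrow\quad \rho \le 1.
\end{aligned}
$$

The rescaling itself is harmless. For constants $a, b > 0$ (introduced here only for illustration),

$$
\operatorname{Corr}(aX, bY)
= \frac{ab\operatorname{Cov}(X, Y)}{a\sqrt{\operatorname{Var}(X)}\; b\sqrt{\operatorname{Var}(Y)}}
= \operatorname{Corr}(X, Y),
$$

so one can take $a = 1/\sqrt{\operatorname{Var}(X)}$ and $b = 1/\sqrt{\operatorname{Var}(Y)}$. This presumes both variances are positive and finite, which is needed for the correlation to be defined in the first place.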
