Covariance intuition

covariance, expected value, probability

My textbook, Introduction to Probability by Blitzstein and Hwang, says the following in a section on covariance and correlation:

Definition 7.3.1 (Covariance). The covariance between r.v.s $X$ and $Y$ is

$$\text{Cov}(X, Y) = E((X - EX)(Y - EY)).$$

Multiplying this out and using linearity, we have an equivalent expression:

$$\text{Cov}(X, Y) = E(XY) - E(X)E(Y).$$

Let's think about the definition intuitively. If $X$ and $Y$ tend to move in the same direction, then $X - EX$ and $Y - EY$ will tend to be either both positive or both negative, so $(X - EX)(Y - EY)$ will be positive on average, giving a positive covariance. If $X$ and $Y$ tend to move in opposite directions, then $X - EX$ and $Y - EY$ will tend to have opposite signs, giving a negative covariance.

If $X$ and $Y$ are independent, then their covariance is zero. We say that r.v.s with zero covariance are uncorrelated.
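For reference, the "multiplying this out and using linearity" step can be written out explicitly. This is only a sketch of that step; the shorthand $\mu_X = EX$ and $\mu_Y = EY$ is introduced here for readability and is not the textbook's notation.

$$\begin{aligned}
\text{Cov}(X, Y) &= E\big((X - \mu_X)(Y - \mu_Y)\big) \\
&= E(XY - \mu_Y X - \mu_X Y + \mu_X \mu_Y) \\
&= E(XY) - \mu_Y E(X) - \mu_X E(Y) + \mu_X \mu_Y \\
&= E(XY) - E(X)E(Y).
\end{aligned}$$

The last line uses $\mu_Y E(X) = \mu_X E(Y) = \mu_X \mu_Y = E(X)E(Y)$, so two of the three constant terms cancel.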

  1. The intuition with regard to zero covariance is clear to me, since I assume that zero covariance implies that the random variables $X$ and $Y$ are independent, and so $E(XY) = E(X)E(Y)$, right?

  2. Why will $X - EX$ and $Y - EY$ tend to be either both positive or both negative if $X$ and $Y$ tend to move in the same direction?

  3. Why will $X - EX$ and $Y - EY$ tend to have opposite signs if $X$ and $Y$ tend to move in opposite directions?

I would greatly appreciate it if people could please take the time to clarify these points.

Best Answer

This example may help you interpret what "move in the same direction" or "move in the opposite direction" might mean.

Let $(X,Y) = (1,1)$ with probability $1/2$, and $(X,Y) = (-1,-1)$ otherwise. Then $E[X] = E[Y] = 0$, so $\text{Cov}(X,Y) = E[XY] = 1$. In this example, $X$ and $Y$ move in the same direction together. (Indeed, we have that $X=Y$.)

Suppose instead that $(X,Y) = (1,-1)$ with probability $1/2$, and $(X,Y) = (-1,1)$ otherwise. Then, $\text{Cov}(X,Y) = E[XY] = -1$, and $X$ and $Y$ tend to move in opposite directions.
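The two examples above can be checked numerically. The following is a minimal sketch, assuming the joint distribution is given as a dictionary from outcomes `(x, y)` to probabilities; the helper name `covariance` and the variable names are illustrative choices, not part of the original answer.

```python
def covariance(pmf):
    """Covariance of (X, Y) for a finite joint pmf given as {(x, y): probability}."""
    ex = sum(p * x for (x, y), p in pmf.items())  # E[X]
    ey = sum(p * y for (x, y), p in pmf.items())  # E[Y]
    # E[(X - EX)(Y - EY)], straight from the definition
    return sum(p * (x - ex) * (y - ey) for (x, y), p in pmf.items())

# X and Y always agree: they move in the same direction.
same_direction = {(1, 1): 0.5, (-1, -1): 0.5}

# X and Y always disagree: they move in opposite directions.
opposite_direction = {(1, -1): 0.5, (-1, 1): 0.5}

print(covariance(same_direction))      # 1.0
print(covariance(opposite_direction))  # -1.0
```

In the first case every outcome has $XY = 1$, and in the second every outcome has $XY = -1$, which is why the averages come out as exactly $1$ and $-1$.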

More generally (still restricting ourselves to mean-zero random variables), when $X$ and $Y$ are positively correlated, they tend to move together in the same direction: typically, when $X$ is positive so is $Y$, and when $X$ is negative so is $Y$ (and vice versa), so that $XY > 0$ on average. You can interpret negative correlation similarly.
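A small simulation can make "tend to have the same sign" concrete. This is only an illustration under an assumed model that is not in the original answer: $X$ is standard normal and $Y = X + \text{noise}$ with independent standard normal noise, so both have mean zero and are positively correlated.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable
n = 10_000

# Assumed model (for illustration only): Y is X plus independent noise.
pairs = []
for _ in range(n):
    x = random.gauss(0, 1)
    y = x + random.gauss(0, 1)
    pairs.append((x, y))

# Fraction of draws where X and Y have the same sign, i.e. XY > 0.
same_sign = sum(1 for x, y in pairs if x * y > 0) / n

# Sample average of XY, which estimates Cov(X, Y) because both means are zero.
mean_product = sum(x * y for x, y in pairs) / n

print(f"fraction of draws with matching signs: {same_sign:.3f}")
print(f"sample average of XY: {mean_product:.3f}")
```

Under this model the signs should match in roughly three quarters of the draws, not all of them. That is the sense of "tend to": individual draws can disagree in sign, but the positive products outweigh the negative ones on average.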

If we want to extend this interpretation to random variables with non-zero mean, then apply the above intuition to the centered random variables $X-E[X]$ and $Y-E[Y]$, which have mean zero and whose product has expectation exactly $\text{Cov}(X,Y)$.
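To see why the centering matters, here is a hedged sketch using a made-up distribution (not from the original answer): $(X,Y) = (4,9)$ or $(2,11)$, each with probability $1/2$. Both variables are always positive, so $XY$ is always positive, yet they move in opposite directions relative to their means.

```python
def covariance(pmf):
    """Covariance of (X, Y) for a finite joint pmf given as {(x, y): probability}."""
    ex = sum(p * x for (x, y), p in pmf.items())  # E[X]
    ey = sum(p * y for (x, y), p in pmf.items())  # E[Y]
    return sum(p * (x - ex) * (y - ey) for (x, y), p in pmf.items())

# Hypothetical example: when X is above its mean (3), Y is below its mean (10).
shifted_opposite = {(4, 9): 0.5, (2, 11): 0.5}

# The raw product XY is positive for every outcome ...
raw_products = [x * y for (x, y) in shifted_opposite]
print(raw_products)  # [36, 22]

# ... but the centered products (X - EX)(Y - EY) are all negative.
print(covariance(shifted_opposite))  # -1.0
```

Here $E[XY] = 29$ and $E[X]E[Y] = 30$, so the shortcut formula $E[XY] - E[X]E[Y]$ gives the same value of $-1$. The sign of $XY$ alone says nothing about the direction of co-movement once the means are non-zero.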