Given $y = sx$ and $E[(x-E[x])(y-E[y])] = E[(x-E[x])(sx-E[sx])]$, how to derive $Cov(x, y) = 0$?

probability distributions

Section 3.8 of *Deep Learning* (Ian Goodfellow, Yoshua Bengio, and Aaron Courville) says:

suppose we first sample a real number $x$ from a
uniform distribution over the interval $[−1, 1]$. We next sample a random variable $s$. With probability $\frac{1}{2}$, we choose the value of $s$ to be $1$. Otherwise, we choose the value of $s$ to be $−1$. We can then generate a random variable $y$ by assigning $y = sx$. Clearly, $x$ and $y$ are not independent, because $x$ completely determines the magnitude of $y$. However, $Cov(x, y) = 0$.

How do I derive this? I got as far as

$E[(x-E[x])(y-E[y])] = E[(x-E[x])(sx-E[sx])]$

Best Answer

It is not intuitive, but let us first write out what $Cov(x,y)$ is. Note that $s$ is independent of $x$:

\begin{eqnarray} Cov(x,y) &=& E[xy] - E[x] E[y]\\ &=& E[x\cdot sx] - E[x] E[sx]\\ &=& E[s] E[x^2] - E[s] E[x] E[x] \qquad \text{(since } s \text{ and } x \text{ are independent)}\\ &=& E[s]\cdot \left(E[x^2] - E[x]^2\right)\\ &=& 0 \cdot Var(x)\\ &=& 0 \end{eqnarray}
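For concreteness, the individual moments in this particular example can be filled in directly from the stated distributions ($x$ uniform on $[-1,1]$, $s = \pm 1$ with probability $\frac{1}{2}$ each):

\begin{eqnarray} E[s] &=& \tfrac{1}{2}(1) + \tfrac{1}{2}(-1) = 0\\ E[x] &=& 0, \qquad E[x^2] = \int_{-1}^{1} \tfrac{1}{2}\, t^2 \, dt = \tfrac{1}{3}\\ Cov(x,y) &=& E[s]\left(E[x^2] - E[x]^2\right) = 0 \cdot \tfrac{1}{3} = 0 \end{eqnarray}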

For intuition, the sign of the covariance expresses the tendency of a linear relationship between the two variables. But since $s$ can flip the sign independently of $x$ (one draw may give $x=0.3$ and $y=0.3$, another $x=0.3$ and $y=-0.3$), there can be no such tendency: the product $(x-E[x])(y-E[y])$ is sometimes negative and sometimes positive, and it averages out to zero.
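If you want to see this numerically, here is a minimal Monte Carlo sketch (Python/NumPy; the seed and sample size are arbitrary) that follows the book's sampling procedure and checks both the near-zero covariance and the dependence:

```python
import numpy as np

rng = np.random.default_rng(0)   # seed is arbitrary
n = 1_000_000

# Book's construction: x ~ Uniform[-1, 1], s = +/-1 with probability 1/2, y = s*x.
x = rng.uniform(-1.0, 1.0, size=n)
s = rng.choice([-1.0, 1.0], size=n)
y = s * x

# Empirical covariance: should be close to 0 (within Monte Carlo noise).
print("Cov(x, y) ≈", np.cov(x, y)[0, 1])

# Yet x and y are not independent: x completely determines the magnitude of y.
print("max | |y| - |x| | =", np.abs(np.abs(y) - np.abs(x)).max())
```

The empirical covariance comes out close to zero, while $|y| = |x|$ holds exactly, which is precisely the dependence the book points out.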