Solved – Intuitive reason why jointly normal and uncorrelated imply independence

Tags: correlation, independence, intuition, mathematical-statistics, normal-distribution

It is well-known that if two random variables are jointly normal and uncorrelated, then they are independent. Does anyone have an intuitive reason why this is true? Explanation in terms of data is greatly appreciated.

Updates:

After seeing the answer by @kjetil b halvorsen, I feel I should give some more details. As far as I know, that result (uncorrelated implies independent) is true if the variables are Bernoulli or if they are jointly normal. I would like to stress that I completely understand the math behind the result. When they are jointly normal and $\rho = 0$, the joint density factors into two pieces, one a function of $x$ alone and one a function of $y$ alone, hence independence. What I am looking for is some intuition as to why this is the case. For example, in the Bernoulli case, the only possible data points are $(0,0), (0,1), (1,0), \text{ or } (1,1)$. So if there is no linear relationship between the variables (uncorrelated), then there is no relationship at all (independent)! I am just wondering whether there is anything like this?
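The Bernoulli case can be checked numerically: for any joint pmf on $\{0,1\}^2$, every cell's deviation from the product of the marginals equals $\pm\operatorname{Cov}(X,Y)$, so zero covariance forces exact factorization. A minimal sketch (the particular pmf is randomly generated, purely for illustration):

```python
import random

random.seed(0)

# A random joint pmf for a pair of Bernoulli variables X, Y (illustrative only):
w = [random.random() for _ in range(4)]
s = sum(w)
p00, p01, p10, p11 = (v / s for v in w)   # P(X=x, Y=y) for (x, y) in {0,1}^2

px1 = p10 + p11                           # P(X = 1)
py1 = p01 + p11                           # P(Y = 1)
cov = p11 - px1 * py1                     # Cov(X, Y) = E[XY] - E[X]E[Y]

# Each cell's deviation from the "independent" pmf P(X=x)P(Y=y)
# is exactly +/- cov, so cov = 0 forces exact factorization.
dev = [p00 - (1 - px1) * (1 - py1),
       p01 - (1 - px1) * py1,
       p10 - px1 * (1 - py1),
       p11 - px1 * py1]
print(all(abs(abs(d) - abs(cov)) < 1e-12 for d in dev))  # prints True
```

In other words, for two Bernoulli variables the covariance is not merely an indicator of dependence; it *is* the entire deviation from independence, cell by cell.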

Best Answer

Well, what intuition can there be? For a bivariate normal distribution (for $X$ and $Y$, say), zero correlation implies independence of $X$ and $Y$, while for the quite similar bivariate t distribution with, say, 100 degrees of freedom, independence does not follow from zero correlation. Plotted, these two distributions look quite similar: for both, all contours of the joint density function are ellipses.
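This contrast can be seen in simulated data. A rough sketch (the degrees of freedom here, 10 rather than 100, are my own choice to make the dependence visible): draw uncorrelated pairs from a bivariate normal, and from a bivariate t built by dividing both normal coordinates by one shared chi factor, then compare the correlations of the squares.

```python
import math
import random

random.seed(1)
n, df = 100_000, 10   # df = 10 is an assumption; smaller df gives a stronger effect

def corr(a, b):
    """Sample correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)
    va = sum((x - ma) ** 2 for x in a) / len(a)
    vb = sum((y - mb) ** 2 for y in b) / len(b)
    return cov / math.sqrt(va * vb)

# Bivariate normal with rho = 0: the coordinates are genuinely independent.
xn = [random.gauss(0, 1) for _ in range(n)]
yn = [random.gauss(0, 1) for _ in range(n)]

# Bivariate t with rho = 0: the SAME normals divided by one shared chi
# factor, so a large |X| makes a large |Y| more likely --
# dependence without correlation.
chi = [math.sqrt(sum(random.gauss(0, 1) ** 2 for _ in range(df)) / df)
       for _ in range(n)]
xt = [x / c for x, c in zip(xn, chi)]
yt = [y / c for y, c in zip(yn, chi)]

print("corr(X, Y), t        :", round(corr(xt, yt), 3))             # near 0
print("corr(X^2, Y^2), norm :",
      round(corr([x * x for x in xn], [y * y for y in yn]), 3))     # near 0
print("corr(X^2, Y^2), t    :",
      round(corr([x * x for x in xt], [y * y for y in yt]), 3))     # clearly positive
```

The squares are correlated for the t pair but not for the normal pair: the shared chi factor couples the magnitudes of the two coordinates even though the coordinates themselves are uncorrelated.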

The only intuition that I can see is algebraic: the joint density of the bivariate normal is a constant times an exponential function, and the argument of the exponential is a quadratic polynomial in $x, y$. When the correlation is zero, this polynomial contains no cross term $xy$, only the pure quadratic terms $x^2, y^2$. Then the property of the exponential function that $$ \exp(-x^2 -y^2) = \exp(-x^2)\cdot \exp(-y^2) $$ kicks in (of course the actual terms are more complicated, but that does not change the idea). If you try to do the same with the bivariate t distribution, everything is the same, except---the quadratic polynomial sits inside the argument of another function without that nice separation property of the exponential! That's the only intuition that I can see.
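In symbols, with standardized margins the two densities at $\rho = 0$ behave as follows (the bivariate t here has $\nu$ degrees of freedom):

```latex
\begin{align*}
f_{\text{normal}}(x,y)
  &= \frac{1}{2\pi\sqrt{1-\rho^2}}
     \exp\!\left(-\frac{x^2 - 2\rho xy + y^2}{2(1-\rho^2)}\right)
  \;\xrightarrow{\;\rho=0\;}\;
     \frac{e^{-x^2/2}}{\sqrt{2\pi}} \cdot \frac{e^{-y^2/2}}{\sqrt{2\pi}},
\\[6pt]
f_{t_\nu}(x,y)
  &\propto \left(1 + \frac{x^2 - 2\rho xy + y^2}{\nu(1-\rho^2)}\right)^{-(\nu+2)/2}
  \;\xrightarrow{\;\rho=0\;}\;
     \left(1 + \frac{x^2 + y^2}{\nu}\right)^{-(\nu+2)/2}.
\end{align*}
```

The normal density splits into a function of $x$ times a function of $y$ because $e^{a+b} = e^a e^b$, but the t density does not, since $(1 + a + b)^c \neq (1+a)^c (1+b)^c$ in general.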