Below is a problem I just did. My question is not how to solve it; I include it only to illustrate exactly what I am asking.
Suppose $X, Y$ are bivariate normal random variables with $E[X] = 40$, $\mathrm{Var}(X) = 76$, $E[Y] = 30$, $\mathrm{Var}(Y) = 32$, and
$\mathrm{Var}(X \mid Y = 28.5) = 57$. Calculate $\mathrm{Var}(Y \mid X = 25)$.
Although I know very little about bivariate random variables, I was able to solve this problem because I have a formula:
$$\mathrm{Var}(Y \mid X = x) = \sigma_{Y}^2(1 - \rho^2).$$
I am not certain, but based on convention I assume $\rho = \rho_{X,Y} = \frac{\mathrm{Cov}(X,Y)}{\sigma_X \sigma_Y}$.
Looking at the information given and my formula, I saw I could use the given conditional variance to solve for $\rho$, and then re-use the formula to find the desired value. That is when I realized that the answer in no way depends on the values of the conditioning variables ($Y = 28.5$, $X = 25$). This seemed strange to me. Keep in mind that my solution is just number crunching; I don't have much background knowledge to provide intuition.
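For concreteness, here is the arithmetic as I carried it out. It assumes the same formula holds with the roles of $X$ and $Y$ swapped, i.e. $\mathrm{Var}(X \mid Y = y) = \sigma_X^2(1 - \rho^2)$, which I am taking on symmetry rather than from a reference:

$$57 = \mathrm{Var}(X \mid Y = 28.5) = \sigma_X^2(1 - \rho^2) = 76\,(1 - \rho^2) \quad\Longrightarrow\quad 1 - \rho^2 = \tfrac{57}{76} = \tfrac{3}{4},$$
$$\mathrm{Var}(Y \mid X = 25) = \sigma_Y^2(1 - \rho^2) = 32 \cdot \tfrac{3}{4} = 24.$$

Neither $28.5$ nor $25$ enters the computation at any point.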
Can someone explain intuitively why the function $f(x) = \mathrm{Var}(Y \mid X = x)$ is a constant function?
In my head, when I picture a bivariate normal distribution, I see what looks like an ant-hill centered over $(0,0)$ in $\mathbb{R}^2$ (yes, technically I'm picturing a standard bivariate normal). But if I consider the cross sections cut out by fixing values of $X$, it seems the ones closer to the origin have a bigger hump, and thus less variance? Is each cross section, for different values of $X$, actually just a scaling of the others, so that the variance stays fixed? Was this intentional in the construction of bivariate normals?
Best Answer
A way of seeing this is to consider how to generate pairs of random values from a bivariate Normal distribution with $X \sim N(\mu_X,\sigma^2_X)$ and $Y \sim N(\mu_Y,\sigma^2_Y)$ and covariance $\sigma_{X,Y}=\rho \sigma_{X}\sigma_{Y}$ between $X$ and $Y$.
One approach is to calculate the parts of $Y$ that depend on $X$ and do not depend on $X$ separately, and then add them together, which you can do as this is a bivariate Normal. The conditional variance of $Y$ given $X=x$ is then just the variance of the part of $Y$ that does not depend on $X$, and naturally this is not affected by the particular value $x$ that $X$ takes.
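As a working algorithm:
1. Generate $X$ from $N(\mu_X, \sigma_X^2)$.
2. Compute $W = \mu_Y + \rho \frac{\sigma_Y}{\sigma_X}(X - \mu_X)$, the part of $Y$ that depends on $X$.
3. Generate $Z$ from $N(0, (1-\rho^2)\sigma_Y^2)$ independently of $X$; this is the part of $Y$ that does not depend on $X$.
4. Set $Y = W + Z$.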
This means $\mathrm{Var}(Y \mid X = x) = \mathrm{Var}(Z) = (1 - \rho^2)\sigma_{Y}^2$, which does not depend on the value of $x$.
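The decomposition can also be checked numerically. What follows is only a rough sketch in Python with NumPy, using the numbers from the question. Several things in it are assumptions made for illustration: the function names `sample_bivariate_normal` and `conditional_variance_near`, the positive sign chosen for $\rho = 0.5$ (the question only pins down $\rho^2 = 1/4$), and the use of a narrow window around $x$ as a stand-in for conditioning on $X = x$ exactly.

```python
import numpy as np


def sample_bivariate_normal(mu_x, var_x, mu_y, var_y, rho, n, rng):
    """Draw n pairs (X, Y) by building Y as W + Z (hypothetical helper)."""
    sd_x, sd_y = np.sqrt(var_x), np.sqrt(var_y)
    # Draw X from its marginal distribution.
    x = rng.normal(mu_x, sd_x, size=n)
    # W: the part of Y that is a deterministic (linear) function of X.
    w = mu_y + rho * sd_y / sd_x * (x - mu_x)
    # Z: the part of Y that is independent of X, with variance (1 - rho^2) * var_y.
    z = rng.normal(0.0, np.sqrt(1.0 - rho**2) * sd_y, size=n)
    return x, w + z


def conditional_variance_near(x, y, x0, half_width=0.5):
    """Approximate Var(Y | X = x0) by the sample variance of Y over |X - x0| < half_width."""
    mask = np.abs(x - x0) < half_width
    return y[mask].var()


rng = np.random.default_rng(0)
rho = 0.5  # assumption: positive sign; only rho^2 = 0.25 follows from the question
x, y = sample_bivariate_normal(40.0, 76.0, 30.0, 32.0, rho, 1_000_000, rng)

# Theory says each estimate should be near (1 - rho^2) * 32 = 24, whatever x0 is.
for x0 in (25.0, 40.0, 55.0):
    print(x0, conditional_variance_near(x, y, x0))
```

The window introduces a small upward bias, because $W$ still varies a little inside it, and the estimates are noisier far from $\mu_X$ where fewer samples land. Up to that noise, the three values should agree with each other, while the unconditional sample variance of `y` stays near $32$.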