Here and here, it is claimed that under the following conditions:

(1) $Y$ follows a normal distribution $N(\mu_Y, \sigma_Y^2)$, (2) $X$ follows a normal distribution $N(\mu_X, \sigma_X^2)$ (note: $X$ and $Y$ are not necessarily independent), and (3) $E(Y|x)$, the conditional mean of $Y$ given $X = x$, is linear in $x$, then:

$$E(Y|x)=\mu_Y+\rho \frac{\sigma_Y}{\sigma_X}(x-\mu_X) \cdots (*)$$

But no proof is given. Can anyone show me how this is proven? Also note that there is no mention of whether $X$ and $Y$ are bivariate normal; if they were, the above result would be easy to show.

Put simply, my question is: Is it true that if (1) $Y \sim N(\mu_Y, \sigma_Y^2)$, (2) $X \sim N(\mu_X, \sigma_X^2)$, (3) the correlation between $X$ and $Y$ is $\rho$, and (4) $E(Y|x)$ is linear in $x$, then $(*)$ holds?

If true, how does one go about proving it? I have never seen this result anywhere; I have only seen the expression $(*)$ appear when $(X, Y)$ is bivariate normal, in which case it is simply the mean of the conditional distribution of $Y$ given $X$.

Best Answer

By assumption, we can write $E[Y|X] = aX + b$, where $a, b$ are constants yet to be determined. Taking expectations of both sides and using the law of total expectation, $E[E[Y|X]] = E[Y]$, we have

$$ E[E[Y|X]] = aE[X] + b \Rightarrow \mu_Y = a\mu_X + b$$

On the other hand, by the definition of the correlation coefficient $\rho$, the covariance is

$$ Cov[X, Y] = \rho\sigma_X\sigma_Y $$

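By the law of total covariance, we have

$$ Cov[X, Y] = E[Cov[X, Y|X]] + Cov[E[X|X], E[Y|X]] = 0 + Cov[X,aX+b] = a\sigma_X^2 $$

Here the first term vanishes because $X$ is constant once we condition on $X$, and the second term uses $E[X|X] = X$.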

Equating the two expressions for $Cov[X, Y]$ (and assuming $\sigma_X > 0$), we have

$$ a\sigma_X^2 = \rho\sigma_X\sigma_Y \Rightarrow a = \rho\frac {\sigma_Y} {\sigma_X}$$

and putting all the results together

$$ E[Y|X] = aX + b = aX + \mu_Y - a\mu_X = \mu_Y + a(X - \mu_X) = \mu_Y + \rho\frac {\sigma_Y} {\sigma_X}(X - \mu_X) $$

Note that none of the normality assumptions are needed here; the argument only uses that $X$ and $Y$ have finite variances with $\sigma_X > 0$. So this result holds in general, and in particular you should see a similar result in linear regression.
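
As a rough numerical sanity check of the claim that normality is not needed, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than part of the question: $X$ is taken to be exponential, the noise is uniform, and the "true" line $E[Y|X] = 2X + 1$ is chosen arbitrarily. The helper names `slope_and_intercept_from_moments` and `simulate` are hypothetical.

```python
import math
import random


def slope_and_intercept_from_moments(xs, ys):
    """Return (a, b) with a = rho * sigma_Y / sigma_X and b = mu_Y - a * mu_X,
    all computed from sample moments."""
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    var_x = sum((x - mu_x) ** 2 for x in xs) / n
    var_y = sum((y - mu_y) ** 2 for y in ys) / n
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)
    rho = cov / (sd_x * sd_y)
    a = rho * sd_y / sd_x
    b = mu_y - a * mu_x
    return a, b


def simulate(n=50_000, seed=0):
    """Draw non-normal (X, Y) with E[Y | X] = 2 X + 1 (assumed example)."""
    rng = random.Random(seed)
    xs = [rng.expovariate(1.0) for _ in range(n)]            # X is exponential, not normal
    ys = [2.0 * x + 1.0 + rng.uniform(-1.0, 1.0) for x in xs]  # mean-zero uniform noise
    return xs, ys


xs, ys = simulate()
a, b = slope_and_intercept_from_moments(xs, ys)
print(f"a = {a:.3f}, b = {b:.3f}")  # should be close to the assumed slope 2 and intercept 1
```

If the derivation is right, the slope and intercept recovered from $\rho$, $\sigma_X$, $\sigma_Y$, $\mu_X$, $\mu_Y$ should land near the assumed values 2 and 1, up to sampling error, even though neither $X$ nor $Y$ is normal.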

