[Math] Conditional expectation with normal distribution – Clarification needed

Tags: conditional-probability, probability

I am trying to work out a conditional expectation in the following setting. Suppose $\theta$ has a standard normal distribution, i.e., normal with mean $0$ and variance $1$. Let $x_i=\theta+\epsilon_i$, where $\epsilon_i$ is normal with mean $0$ and variance $\frac{1-v}{v}$ for some $v\in(0,1)$. It is claimed that $\mathbf{E}[\theta\mid x_i]=vx_i$. To my mind, since $\theta=x_i-\epsilon_i$, we should have $\mathbf E[\theta\mid x_i]=x_i$. I tried to obtain the answer the proper way, using the normal pdf and the definition of conditional expectation, but I couldn't figure out how to solve the integrals. Thanks for any help!

P.S. As clarified in a comment: $\theta$ and $\epsilon_i$ are independent.

Best Answer

Sketch: Let $X=Y+Z$ with $Y$ and $Z$ independent normals, $Y \sim N(0,1)$ and $Z \sim N(0,a)$. We want $E[Y\mid X]$. (In the notation of the question, $Y=\theta$, $Z=\epsilon_i$, $X=x_i$, and $a=\frac{1-v}{v}$.)

A common mistake (the one you made) is to write $E[Y\mid X]=E[X\mid X]-E[Z\mid X]=X$. The problem is that $X$ and $Z$ are not independent, hence $E[Z\mid X] \ne 0$. [*]
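Indeed, one can see this directly. (This computation is not in the original answer; it is the standard regression formula for jointly normal variables, which applies here because $Y$ and $Z$ are independent normals.)

$$\operatorname{Cov}(Z,X)=\operatorname{Cov}(Z,Y+Z)=\operatorname{Var}(Z)=a,\qquad \operatorname{Var}(X)=1+a,$$

$$E[Z\mid X]=\frac{\operatorname{Cov}(Z,X)}{\operatorname{Var}(X)}\,X=\frac{a}{1+a}\,X\ne 0.$$

Subtracting this from $X$ already gives the answer derived below.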

One way is to apply Bayes' formula to the densities. Informally:

$$P(Y|X) = \frac{P(X|Y)P(Y)}{P(X)}=\frac{\psi(X;Y,a) \,\psi(Y;0,1)}{\psi(X;0,a+1)}$$ where $\psi(t;u,s)$ is the normal density with mean $u$ and variance $s$ evaluated at $t$. Then, regarding $X$ as fixed, we can ignore the denominator (it is a normalization factor), and the numerator is a product of normal densities; working out the exponent (remember that $Y$ is the variable here), we can complete the square:

$$-\frac{(X-Y)^2}{2a}-\frac{Y^2}{2}=-\frac{1}{2a}\left((1+a)Y^2-2XY+X^2\right)=-\frac{1+a}{2a}\left( Y- \frac{X}{1+a} \right)^2 + c$$ where $c$ does not depend on $Y$ (no need to compute it).

Hence, we get that $Y$ conditioned on $X$ is normal with mean $\frac{X}{1+a}$ (from the coefficient of the square you can also read off the variance, $\frac{a}{1+a}$), so $$E[Y|X]=\frac{X}{1+a}.$$ In the question's notation, $a=\frac{1-v}{v}$ gives $\frac{1}{1+a}=v$, so $\mathbf E[\theta\mid x_i]=vx_i$, as claimed.
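As a sanity check (my addition, not part of the original answer), here is a quick Monte Carlo sketch using NumPy: sample $(Y,Z)$, condition on $X$ falling in a narrow window around a fixed $x_0$, and compare the empirical mean of $Y$ with $X/(1+a)$. The values $a=0.5$ and $x_0=1.2$ are arbitrary choices for illustration.

```python
import numpy as np

# Monte Carlo check of E[Y | X] = X / (1 + a)
# (illustrative values of a and x0; not from the original post)
rng = np.random.default_rng(0)
a = 0.5                    # Var(Z); corresponds to v = 1/(1+a) = 2/3
n = 2_000_000

Y = rng.normal(0.0, 1.0, size=n)           # Y ~ N(0, 1)
Z = rng.normal(0.0, np.sqrt(a), size=n)    # Z ~ N(0, a), independent of Y
X = Y + Z

x0 = 1.2
mask = np.abs(X - x0) < 0.01               # condition on X being near x0
print("empirical E[Y | X ~ x0]:", Y[mask].mean())
print("formula   x0 / (1 + a):", x0 / (1 + a))   # = 0.8
```

The two printed numbers should agree to roughly two decimal places; the same windowed average of $Y^2$ would let you check the conditional variance $\frac{a}{1+a}$ as well.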

[*] Why is this mistake so common? Perhaps because of a wrong use of cause-and-effect concepts. Given $C=A+B$, one regards $C$ as an "output" and $A,B$ as "inputs" (fine so far), then slips into thinking of $A,B$ as "causes" and $C$ as the "effect"; one then reasons that "the output (effect) depends on the inputs (causes)" but, conversely, "the inputs do not depend on the output" (false; for one thing, independence is a symmetric relation). From this one concludes that conditioning on the output does not change the probabilities of the inputs, so that $E(A\mid C)=E(A)$. Of course, this is wrong.