Conditional variance: How did the author get from $\text{Var}(Y | X) = E((Y - E(Y | X))^2 | X)$ to $\text{Var}(Y | X) = E(Y^2 | X) - (E(Y | X))^2$?

conditional probability, expected value, probability, variance

My textbook, Introduction to Probability by Blitzstein and Hwang, says the following in a section on conditional variance:

Definition 9.5.1 (Conditional variance). The conditional variance of $Y$ given $X$ is

$$\text{Var}(Y | X) = E((Y - E(Y | X))^2 | X).$$

This is equivalent to

$$\text{Var}(Y | X) = E(Y^2 | X) - (E(Y | X))^2.$$

I now attempt to expand $E((Y – E(Y | X))^2 | X)$:

$$\begin{align} E((Y - E(Y | X))^2 | X) &= E((Y^2 - 2YE(Y | X) + E(Y | X)^2) | X) \\ &= E(Y^2 | X - 2YE(Y | X) + E(Y | X)^2) \end{align}$$

So I have two questions about this:

  1. In the above, I assumed that $|X$ is distributive; is this valid?

  2. I wasn't completely sure how to use the linearity of expectation here, specifically for the $-2YE(Y | X)$ term. How is it done correctly? Naively, I would have proceeded as follows: $E(Y^2 | X - 2YE(Y | X) + E(Y | X)^2) = E(Y^2 | X) - 2 E(YE(Y | X)) + E(Y | X)^2$ (since the expected value of an expected value is just the expected value). Does this seem correct, or am I doing something incorrectly (and if so, what is my misunderstanding)?

The Wikipedia page for conditional variance has the following:

$$\begin{align}
\operatorname{E}[ (Y-f(X))^2 ]
&= \operatorname{E}[ (Y-\operatorname{E}(Y|X)\,\,+\,\, \operatorname{E}(Y|X)-f(X) )^2 ] \\
&= \operatorname{E}[ \operatorname{E}\{ (Y-\operatorname{E}(Y|X)\,\,+\,\, \operatorname{E}(Y|X)-f(X) )^2|X\} ] \\
&= \operatorname{E}[\operatorname{Var}( Y| X )] + \operatorname{E}[(\operatorname{E}(Y|X)-f(X))^2]\,.
\end{align}$$

But this seems different to what was presented in the textbook, so I'm also struggling to see how this result was obtained, and how it relates to the one in the textbook.

So, I guess the third question would be:

  3. How did the author get from $\text{Var}(Y | X) = E((Y - E(Y | X))^2 | X)$ to $\text{Var}(Y | X) = E(Y^2 | X) - (E(Y | X))^2$?

I would greatly appreciate it if people could please take the time to clarify this.

Best Answer

Expanding the square gives

$$ E((Y-E(Y|X))^2|X) = E(Y^2 - 2E(Y|X)Y + E(Y|X)^2|X). $$

Two facts handle the last two terms. First, $E(E(Y|X)^2|X) = E(Y|X)^2$, because $E(Y|X)$ is measurable with respect to the sigma-algebra generated by $X$, and hence so is $E(Y|X)^2$. Second, $E(ZY|X) = ZE(Y|X)$ whenever $Z$ is measurable with respect to the sigma-algebra generated by $X$; taking $Z = E(Y|X)$ gives $E(E(Y|X)Y|X) = E(Y|X)^2$.

Therefore, by linearity of conditional expectation,

$$ E((Y-E(Y|X))^2|X) = E(Y^2|X) - 2E(Y|X)^2 + E(Y|X)^2 = E(Y^2|X) - E(Y|X)^2. $$
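As a sanity check, the identity can be verified exactly on a small discrete example. The sketch below is not from the textbook: the joint pmf `joint_pmf` and the helper names are made up for illustration. On the event $\{X = x\}$, $E(Y|X)$ is just a constant, which is why it can be pulled out of the conditional expectation.

```python
from fractions import Fraction

# Hypothetical joint pmf P(X = x, Y = y); the numbers are illustrative only.
joint_pmf = {
    (0, 1): Fraction(1, 8),
    (0, 2): Fraction(3, 8),
    (1, 0): Fraction(1, 4),
    (1, 3): Fraction(1, 4),
}


def conditional_expectation(g, x):
    """Return E(g(Y) | X = x) under joint_pmf."""
    p_x = sum(p for (xi, _), p in joint_pmf.items() if xi == x)
    weighted = sum(g(y) * p for (xi, y), p in joint_pmf.items() if xi == x)
    return weighted / p_x


def var_by_definition(x):
    """E((Y - E(Y | X))^2 | X = x), the defining formula."""
    m = conditional_expectation(lambda y: y, x)  # constant once X = x is fixed
    return conditional_expectation(lambda y: (y - m) ** 2, x)


def var_by_shortcut(x):
    """E(Y^2 | X = x) - (E(Y | X = x))^2, the equivalent formula."""
    m = conditional_expectation(lambda y: y, x)
    return conditional_expectation(lambda y: y ** 2, x) - m ** 2


for x in (0, 1):
    print(x, var_by_definition(x), var_by_shortcut(x))
```

Using `Fraction` keeps the arithmetic exact, so the two formulas can be compared with `==` rather than up to floating-point tolerance.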
