[Math] Prove tower property of conditional expectation $\mathbb{E}[\mathbb{E}[X|Y,W]| Y = y] = \mathbb{E}[X|Y = y]$

conditional-expectation, probability-theory

I am using Sheldon Ross 9th edition as a reference and all I am given is the definition of conditional expectation:

$$E[X|Y = y] = \sum\limits_x x \Pr[X = x| Y = y]$$
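For concreteness, this definition can be evaluated directly on a small finite joint distribution. The following is a minimal sketch; the joint pmf and the helper name `cond_exp` are illustrative assumptions, not taken from the text:

```python
from fractions import Fraction

# Illustrative joint pmf Pr[X = x, Y = y] (hypothetical numbers): weights sum to 6.
joint = {(0, 0): 1, (1, 0): 2, (0, 1): 1, (1, 1): 1, (2, 1): 1}
pmf = {k: Fraction(v, 6) for k, v in joint.items()}

def cond_exp(y0):
    """E[X | Y = y0] = sum_x x * Pr[X = x | Y = y0]."""
    # Pr[Y = y0], by marginalizing the joint pmf.
    py = sum(p for (x, y), p in pmf.items() if y == y0)
    # sum_x x * Pr[X = x, Y = y0], then divide by Pr[Y = y0].
    return sum(x * p for (x, y), p in pmf.items() if y == y0) / py

print(cond_exp(0))  # 2/3
print(cond_exp(1))  # 1
```

Using `Fraction` keeps the arithmetic exact, so the conditional expectations come out as exact rationals rather than floats.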

I am required to prove the tower property of conditional expectation:

$$\mathbb{E}[\mathbb{E}[X|Y,W]| Y = y] = \mathbb{E}[X|Y = y]$$

I am very lost and need assistance. First, we recognize that $\mathbb{E}[X|Y,W]$ is a random variable that is a function of $Y$ and $W$.

We first write:

$$\mathbb{E}[X|Y=y,W=w] = \sum\limits_x x \Pr[X = x| Y = y,W = w],$$

then,

\begin{align*}\mathbb{E}[\mathbb{E}[X|Y,W]| Y = y] &= \mathbb{E}[\sum\limits_x x \Pr[X = x| Y = y,W = w]|Y = y]\\ &= \sum\limits_y \sum\limits_w \sum\limits_x x \Pr[X = x| Y = y,W = w] yw\Pr[Y = y,W = w|Y = y]\\ & = \sum\limits_y \sum\limits_w \sum\limits_x x \Pr[X = x| Y = y,W = w] yw\frac{\Pr[Y = y,W = w]}{\Pr[Y=y]} \\ &= \sum\limits_y \sum\limits_w \sum\limits_x x yw\frac{\Pr[X = x,Y=y,W=w]}{\Pr[Y=y]} \end{align*}

The problem is that at this step, all I am left with is an extra factor of $\dfrac{1}{\Pr[Y=y]}$.

I strongly suspect:

\begin{align*}\mathbb{E}[\mathbb{E}[X|Y,W]| Y = y] = \sum\limits_y \sum\limits_w \sum\limits_x x \Pr[X = x| Y = y,W = w] yw\Pr[Y = y,W = w|Y = y] \end{align*} is incorrect. But it makes intuitive sense to me. We are taking the expectation of two jointly defined random variables $Y,W$, so the probability is $\Pr[Y = y,W = w|Y = y]$.

Can someone please point me in the right direction?

Best Answer

I've adapted the following definition of $\mathbb E [X \vert W, Y]$ from the discussion in Probability with Martingales by Williams, section 9.6.

Define the function $g$ by

$$g (w,y) = \mathbb E[X \vert W = w, Y = y]. $$

We define $\mathbb E [X \vert W,Y]$ to be the random variable $g(W,Y)$. Note this is different from the expectation of $X$ conditional on the trivially true event $\{W=W, Y=Y\}$, which is just the unconditional expectation of $X$.

We can thus make the following calculation:

\begin{align*} \mathbb E [ \mathbb E [X \vert W,Y] \vert Y= y ] &= \mathbb E [g(W,Y) \vert Y = y] \\ &= \sum_{w,y^\prime} g\left(w,y^\prime \right) \Pr \left[\left.W = w, Y = y^\prime \right\vert Y = y\right] \\ &= \sum_w g(w,y) \Pr [W = w \vert Y = y] \\ &= \sum_w \mathbb E [X \vert W=w,Y=y] \Pr [W = w \vert Y = y] \\ &= \sum_w \left[ \sum_x x \Pr [X=x \vert W=w,Y=y] \right]\Pr [W = w \vert Y = y] \\ &= \sum_{w,x} x \Pr[X=x, W=w \vert Y = y] \\ &= \sum_x x \Pr[X = x \vert Y = y] \\ &= \mathbb E[X \vert Y=y] \end{align*}

The third equality follows from the fact that $\Pr \left[\left.W = w, Y = y^\prime \right\vert Y = y\right]$ is $\Pr[W = w \vert Y= y]$ for $y^\prime = y$ and $0$ otherwise. The sixth equality comes from the multiplication rule for conditional probabilities, $\Pr[X=x \vert W=w, Y=y]\Pr[W=w \vert Y=y] = \Pr[X=x, W=w \vert Y=y]$; the seventh marginalizes over $w$. The rest should follow from definitions.
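The identity can also be sanity-checked numerically on any finite joint pmf. Below is a minimal sketch; the joint pmf and the helper names `pr`, `cond_exp_X`, and `lhs` are illustrative assumptions, not from the derivation above. `lhs` computes $\mathbb E[\mathbb E[X \vert W,Y] \vert Y=y] = \sum_w g(w,y)\Pr[W=w \vert Y=y]$, which is then compared against $\mathbb E[X \vert Y=y]$ computed directly:

```python
from fractions import Fraction

# Hypothetical joint pmf over (X, Y, W), chosen arbitrarily; weights sum to 12.
weights = {
    (0, 0, 0): 1, (0, 0, 1): 2, (0, 1, 0): 1, (0, 1, 1): 1,
    (1, 0, 0): 2, (1, 0, 1): 1, (1, 1, 0): 1, (1, 1, 1): 1,
    (2, 0, 0): 1, (2, 1, 1): 1,
}
total = sum(weights.values())
pmf = {k: Fraction(v, total) for k, v in weights.items()}

def pr(pred):
    """Probability of the event described by predicate pred(x, y, w)."""
    return sum(p for (x, y, w), p in pmf.items() if pred(x, y, w))

def cond_exp_X(pred):
    """E[X | event] = sum_x x * Pr[X = x | event]."""
    pe = pr(pred)
    return sum(x * p for (x, y, w), p in pmf.items() if pred(x, y, w)) / pe

def lhs(y0):
    """E[ E[X|W,Y] | Y = y0 ] = sum_w g(w, y0) * Pr[W = w | Y = y0]."""
    py = pr(lambda x, y, w: y == y0)
    acc = Fraction(0)
    for w0 in {w for (_, _, w) in pmf}:
        pyw = pr(lambda x, y, w: y == y0 and w == w0)
        if pyw > 0:
            g = cond_exp_X(lambda x, y, w: y == y0 and w == w0)
            acc += g * (pyw / py)
    return acc

for y0 in (0, 1):
    # Tower property: both sides agree exactly.
    assert lhs(y0) == cond_exp_X(lambda x, y, w: y == y0)
```

Since the pmf is held as exact rationals, the two sides match exactly rather than up to floating-point error; any other finite joint table would do equally well.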