I am using Sheldon Ross's 9th edition as a reference, and all I am given is the definition of conditional expectation:
$$E[X|Y = y] = \sum\limits_x x \Pr[X = x| Y = y]$$
I am required to prove the tower property of conditional expectation:
$$\mathbb{E}[\mathbb{E}[X|Y,W]| Y = y] = \mathbb{E}[X|Y = y]$$
I am very lost and need assistance. First, we recognize that $\mathbb{E}[X|Y,W]$ is a random variable that is a function of $Y$ and $W$.
We first write:
$$\mathbb{E}[X|Y=y,W=w] = \sum\limits_x x \Pr[X = x| Y = y,W = w],$$
then,
\begin{align*}\mathbb{E}[\mathbb{E}[X|Y,W]| Y = y] &= \mathbb{E}[\sum\limits_x x \Pr[X = x| Y = y,W = w]|Y = y]\\ &= \sum\limits_y \sum\limits_w \sum\limits_x x \Pr[X = x| Y = y,W = w] yw\Pr[Y = y,W = w|Y = y]\\ & = \sum\limits_y \sum\limits_w \sum\limits_x x \Pr[X = x| Y = y,W = w] yw\frac{\Pr[Y = y,W = w]}{\Pr[Y=y]} \\ &= \sum\limits_y \sum\limits_w \sum\limits_x x yw\frac{\Pr[X = x,Y=y,W=w]}{\Pr[Y=y]} \end{align*}
The problem is that at this step, all I am left with is the stray factor $\dfrac{1}{\Pr[Y=y]}$.
I strongly suspect:
\begin{align*}\mathbb{E}[\mathbb{E}[X|Y,W]| Y = y] = \sum\limits_y \sum\limits_w \sum\limits_x x \Pr[X = x| Y = y,W = w] yw\Pr[Y = y,W = w|Y = y] \end{align*} is incorrect. But it makes intuitive sense to me. We are taking the expectation of two jointly defined random variables $Y,W$, so the probability is $\Pr[Y = y,W = w|Y = y]$.
Can someone please point me in the right direction?
Best Answer
I've adapted the following definition of $\mathbb E [X \vert W, Y]$ from the discussion in Probability with Martingales by Williams, section 9.6.
Define the function $g$ by
$$g (w,y) = \mathbb E[X \vert W = w, Y = y]. $$
We define $\mathbb E [X \vert W,Y]$ to be the random variable $g(W,Y)$. Note that this is different from the expectation of $X$ conditional on the event $\{W=W, Y=Y\}$, which is clearly the unconditional expectation of $X$.
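To make this definition concrete, here is a minimal Python sketch that computes $g(w,y) = \mathbb E[X \vert W=w, Y=y]$ directly from the definition of conditional expectation. The joint pmf below is arbitrary and made up purely for illustration:

```python
# Hypothetical joint pmf over (x, w, y); the values are arbitrary and sum to 1.
pmf = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.05,
    (0, 1, 0): 0.15, (0, 1, 1): 0.10,
    (1, 0, 0): 0.05, (1, 0, 1): 0.20,
    (1, 1, 0): 0.10, (1, 1, 1): 0.25,
}

def g(w, y):
    """g(w, y) = E[X | W = w, Y = y], computed from the joint pmf."""
    # Pr[W = w, Y = y]: marginalize x out of the joint pmf.
    denom = sum(p for (x_, w_, y_), p in pmf.items() if w_ == w and y_ == y)
    # sum_x x * Pr[X = x, W = w, Y = y], then divide to condition on the event.
    numer = sum(x_ * p for (x_, w_, y_), p in pmf.items() if w_ == w and y_ == y)
    return numer / denom
```

The random variable $\mathbb E[X \vert W, Y]$ is then just `g` evaluated at the (random) pair $(W, Y)$: it takes the value `g(w, y)` on the event $\{W=w, Y=y\}$.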
We can thus make the following calculation:
\begin{align*} \mathbb E [ \mathbb E [X \vert W,Y] \vert Y= y ] &= \mathbb E [g(W,Y) \vert Y = y] \\ &= \sum_{w,y^\prime} g\left(w,y^\prime \right) \Pr \left[\left.W = w, Y = y^\prime \right\vert Y = y\right] \\ &= \sum_w g(w,y) \Pr [W = w \vert Y = y] \\ &= \sum_w \mathbb E [X \vert W=w,Y=y] \Pr [W = w \vert Y = y] \\ &= \sum_w \left[ \sum_x x \Pr [X=x \vert W=w,Y=y] \right]\Pr [W = w \vert Y = y] \\ &= \sum_{w,x} x \Pr[X=x, W=w \vert Y = y] \\ &= \sum_x x \Pr[X = x \vert Y = y] \\ &= \mathbb E[X \vert Y=y] \end{align*}
The third equality follows from the fact that $\Pr \left[\left.W = w, Y = y^\prime \right\vert Y = y\right]$ equals $\Pr[W = w \vert Y= y]$ for $y^\prime = y$ and $0$ otherwise. The sixth equality uses the multiplication rule, $\Pr[X=x \vert W=w, Y=y]\,\Pr[W=w \vert Y=y] = \Pr[X=x, W=w \vert Y = y]$, after swapping the order of summation. The seventh marginalizes out $w$. The rest follows from the definitions.
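As a sanity check on the whole derivation, here is a small Python sketch that verifies $\mathbb E[\mathbb E[X \vert W, Y] \vert Y = y] = \mathbb E[X \vert Y = y]$ numerically on an arbitrary, made-up joint pmf over $(X, W, Y)$:

```python
# Hypothetical joint pmf over (x, w, y); the values are arbitrary and sum to 1.
pmf = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.05,
    (0, 1, 0): 0.15, (0, 1, 1): 0.10,
    (1, 0, 0): 0.05, (1, 0, 1): 0.20,
    (1, 1, 0): 0.10, (1, 1, 1): 0.25,
}

def g(w, y):
    """g(w, y) = E[X | W = w, Y = y]."""
    denom = sum(p for (x_, w_, y_), p in pmf.items() if w_ == w and y_ == y)
    numer = sum(x_ * p for (x_, w_, y_), p in pmf.items() if w_ == w and y_ == y)
    return numer / denom

def cond_exp_X_given_Y(y):
    """E[X | Y = y], computed directly from the definition."""
    denom = sum(p for (x_, w_, y_), p in pmf.items() if y_ == y)
    numer = sum(x_ * p for (x_, w_, y_), p in pmf.items() if y_ == y)
    return numer / denom

def tower_lhs(y):
    """E[g(W, Y) | Y = y] = sum_w g(w, y) * Pr[W = w | Y = y]."""
    denom = sum(p for (x_, w_, y_), p in pmf.items() if y_ == y)
    numer = sum(g(w_, y_) * p for (x_, w_, y_), p in pmf.items() if y_ == y)
    return numer / denom

for y in (0, 1):
    assert abs(tower_lhs(y) - cond_exp_X_given_Y(y)) < 1e-9
```

The loop passes for both values of $y$, which matches the algebraic argument above: averaging $g(w, y)$ against $\Pr[W = w \vert Y = y]$ recovers $\mathbb E[X \vert Y = y]$.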