How are conditional expectation and conditional independence used with the tower property?

conditional-expectation, probability-theory, statistical-inference

I was going through the proof of a theorem on U-statistics (large sample theory) and got stuck on a particular step that involves conditional expectation and conditional independence.

Theorem : When $\textrm{n}\geq \textrm{2m-1}$, the variance of a U-Statistic is $$\textrm{Var}_{F}[U_{n}] = \frac{1}{\binom{\textrm{n}}{\textrm{m}}}\sum_{\textrm{k}=1}^{\textrm{m}}\binom{\textrm{m}}{\textrm{k}} \binom{\textrm{n-m}}{\textrm{m-k}}\sigma_{\textrm{k}}^2 \ \ .$$ If $\sigma_{\textrm{1}}^2, \sigma_{\textrm{2}}^2,…, \sigma_{\textrm{m}}^2< \infty$, then $$\textrm{Var}_{F}[U_{n}] = \textrm{m}^2 \sigma_{1}^2\mathit{O}(\frac{1}{\textrm{n}})+\mathit{O}(\frac{1}{\textrm{n}^2}) \ \ .$$

My problem arises where the proof uses conditional expectation, the tower rule and conditional independence to determine the $\mathbb{E}(\mathit{XY})$ term of the covariance.

The covariance is

$$ \begin{align*} \textrm{Cov}_{F}[g(Y_1,\ldots,Y_m),\ g(Y_1,\ldots,Y_k,Y_{m+1},\ldots,Y_{m+(m-k)})] &= E_{F}[g(Y_1,\ldots,Y_m)\, g(Y_1,\ldots,Y_k,Y_{m+1},\ldots,Y_{m+(m-k)})] \\ &\quad - E_{F}[g(Y_1,\ldots,Y_m)]\, E_{F}[g(Y_1,\ldots,Y_k,Y_{m+1},\ldots,Y_{m+(m-k)})]. \end{align*} $$

Now, for $1\leq k \leq m$,
$$ \begin{align*} E_{F}[g(Y_1,\ldots,Y_m)\, g(Y_1,\ldots,Y_k,Y_{m+1},\ldots,Y_{m+(m-k)})] &= E_{F}\{E_{F}[g(Y_1,\ldots,Y_m)\, g(Y_1,\ldots,Y_k,Y_{m+1},\ldots,Y_{m+(m-k)}) \mid Y_1,\ldots,Y_k]\} && \textrm{(i)} \\ &= E_{F}\{E_{F}[g(Y_1,\ldots,Y_m) \mid Y_1,\ldots,Y_k]\, E_{F}[g(Y_1,\ldots,Y_k,Y_{m+1},\ldots,Y_{m+(m-k)}) \mid Y_1,\ldots,Y_k]\} && \textrm{(ii)} \\ &= E_{F}\{E_{F}[g_1 \mid Y]\, E_{F}[g_2 \mid Y]\}, && \textrm{(iii)} \end{align*} $$
where $Y = (Y_1,\ldots,Y_k)$, $g_1 = g(Y_1,\ldots,Y_m)$ and $g_2 = g(Y_1,\ldots,Y_k,Y_{m+1},\ldots,Y_{m+(m-k)})$.

  1. My problem: I am trying to understand how $\textrm{(ii)}$ follows from $\textrm{(i)}$.

The tower rule states:

For sub-$\sigma$-algebras $\mathcal{H}_{1}\subset \mathcal{H}_{2}\subset \mathscr{F}$, we have $$\mathbb{E}(\mathbb{E}[X\mid\mathcal{H}_{2}]\mid\mathcal{H}_{1}) = \mathbb{E}(\mathbb{E}[X\mid\mathcal{H}_{1}]\mid\mathcal{H}_{2}) = \mathbb{E}(X\mid\mathcal{H}_{1}).$$

For random variables $X, Y, Z$, this gives $\mathbb{E}(\mathbb{E}[X\mid Y,Z]\mid Y) = \mathbb{E}(X\mid Y)$, which is not, in general, equal to $\mathbb{E}(X\mid Y)\,\mathbb{E}(Z\mid Y)$.
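To make the distinction concrete, here is a minimal sketch that checks both statements exactly on a small finite distribution. The joint pmf on $\{0,1\}^3$ and the helper `cond_exp` are hypothetical choices made only for illustration; they are not part of the theorem or its proof.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint pmf on (x, y, z) in {0,1}^3 with unnormalised weights 1..8.
points = list(product((0, 1), repeat=3))
total = sum(range(1, 9))
pmf = {pt: Fraction(w, total) for pt, w in zip(points, range(1, 9))}

def cond_exp(f, given):
    """Return E[f(X, Y, Z) | coordinates listed in `given`] as a dict
    keyed by the values of the conditioning coordinates."""
    num, den = {}, {}
    for pt, p in pmf.items():
        key = tuple(pt[i] for i in given)
        num[key] = num.get(key, 0) + p * f(*pt)
        den[key] = den.get(key, 0) + p
    return {key: num[key] / den[key] for key in num}

e_x_given_yz = cond_exp(lambda x, y, z: x, given=(1, 2))  # E[X | Y, Z]
e_x_given_y = cond_exp(lambda x, y, z: x, given=(1,))     # E[X | Y]
e_z_given_y = cond_exp(lambda x, y, z: z, given=(1,))     # E[Z | Y]

# Tower rule: E(E[X | Y, Z] | Y) compared with E[X | Y].
tower = cond_exp(lambda x, y, z: e_x_given_yz[(y, z)], given=(1,))

# The product E[X | Y] * E[Z | Y], for comparison.
prod = {key: e_x_given_y[key] * e_z_given_y[key] for key in e_x_given_y}

print(tower == e_x_given_y)  # tower rule
print(prod == e_x_given_y)   # product form
```

Because the arithmetic uses exact fractions, the first comparison holds exactly, while the second fails for this particular pmf: every weight is positive, so $\mathbb{E}(Z\mid Y)$ lies strictly between 0 and 1.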

Also, we are given that $$g_{k}(Y_{1}, \ldots, Y_{k}) = E_{F}[g(Y_{1}, \ldots, Y_{m}) \mid Y_{1}, \ldots, Y_{k}]$$ and $$g_{k}(y_{1}, \ldots, y_{k}) = E_{F}[g(y_{1}, \ldots, y_{k}, Y_{k+1}, \ldots, Y_{m})],$$ where $g(\cdot)$ is the kernel of the expectation functional $E_{F}[g(\cdot)]$.

  2. My question: Can we say $\mathbb{E}(\mathbb{E}(\mathbb{E}\mathit{X|Y)Z|Y})=\mathbb{E}(\mathit{X|Y}).\mathbb{E}(\mathit{Z|Y})$? Also, do $\mathit{X}$ and $\mathit{Z}$ need to be independent?

I am surely missing some fact needed to see how $\textrm{(ii)}$ follows from $\textrm{(i)}$, and my second question arises from that problem. I have also gone through this related post, but it is still not clear to me how to resolve it. All comments, explanations, hints and answers are therefore welcome.

  3. Does there exist a generalized form or version of the tower property?

Edit: I suspect that conditional independence, i.e. $\mathbb{E}(\mathit{XY|Z})=\mathbb{E}(\mathit{X|Z})\,\mathbb{E}(\mathit{Y|Z})$, must be assumed in order to determine the $\mathbb{E}(\mathit{XY})$ term, and in that case the proof becomes easier.
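As a rough sanity check of this guess, the sketch below compares $\mathbb{E}(XZ\mid Y)$ with $\mathbb{E}(X\mid Y)\,\mathbb{E}(Z\mid Y)$ in two toy cases. Both joint distributions and the helper `check` are assumptions made up for illustration. In the first case $X$ and $Z$ share the component $Y$, so they are dependent but conditionally independent given $Y$; in the second, $Z = X$.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)

def check(joint):
    """Compare E[XZ | Y=y] with E[X | Y=y] * E[Z | Y=y] for each y.

    `joint` maps (x, y, z) to its probability.
    Returns True if the two sides agree for every value of y.
    """
    ok = True
    for y0 in {y for (_, y, _) in joint}:
        p_y = sum(p for (x, y, z), p in joint.items() if y == y0)
        e_xz = sum(p * x * z for (x, y, z), p in joint.items() if y == y0) / p_y
        e_x = sum(p * x for (x, y, z), p in joint.items() if y == y0) / p_y
        e_z = sum(p * z for (x, y, z), p in joint.items() if y == y0) / p_y
        ok = ok and (e_xz == e_x * e_z)
    return ok

# Case 1: X = Y + A, Z = Y + B with Y, A, B independent fair coins.
# X and Z are dependent, but conditionally independent given Y.
cond_indep = {}
for y, a, b in product((0, 1), repeat=3):
    key = (y + a, y, y + b)
    cond_indep[key] = cond_indep.get(key, 0) + half ** 3

# Case 2: Z = X, with X a fair coin independent of Y.
not_cond_indep = {(x, y, x): half ** 2 for x, y in product((0, 1), repeat=2)}

print(check(cond_indep), check(not_cond_indep))
```

The factorisation holds in the first case and fails in the second, which suggests that the relevant condition is conditional independence given $Y$, not unconditional independence of $X$ and $Z$.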

Best Answer

Write $Y^k = (Y_1, \ldots, Y_k)$, $A = (Y_{k + 1}, \ldots, Y_m)$ and $B = (Y_{m+1}, \ldots, Y_{m+s})$ (with $s = m-k$, say). You have $E(u(Y^k, A)v(Y^k, B))$ for a pair of functions $u$ and $v$ (here $u = v = g$). Since $Y^k$, $A$ and $B$ consist of disjoint sets of independent observations, they are mutually independent, so $A$ and $B$ remain independent given $Y^k$, and we can see that $$ \begin{align*} E(u(Y^k, A) v(Y^k, B)) &= E(E(u(Y^k, A)v(Y^k, B)\mid Y^k)) \\ &= E(E(u(Y^k, A) \mid Y^k)E(v(Y^k, B)\mid Y^k)). \end{align*} $$ The first equality is the tower rule; the second is conditional independence: given $Y^k,$ the random variables $u(Y^k, A)$ and $v(Y^k, B)$ depend solely on the independent random variables $A$ and $B$.
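For a concrete instance, the identity can be checked exactly by enumeration. This is only a sketch under assumed choices: $m = 2$, $k = 1$, the variance kernel $g(a, b) = (a-b)^2/2$, and a hypothetical three-point distribution `pmf` standing in for $F$.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example distribution F on {0, 1, 2}; any finite pmf would do.
pmf = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}

def g(a, b):
    """Symmetric kernel of the sample variance (m = 2)."""
    return Fraction((a - b) ** 2, 2)

def g1(y):
    """g_1(y) = E[g(y, Y_2)], i.e. E[g(Y_1, Y_2) | Y_1 = y]."""
    return sum(p * g(y, b) for b, p in pmf.items())

# Left side: E[g(Y1, Y2) g(Y1, Y3)], summing over all (y1, y2, y3).
lhs = sum(
    pmf[y1] * pmf[y2] * pmf[y3] * g(y1, y2) * g(y1, y3)
    for y1, y2, y3 in product(pmf, repeat=3)
)

# Right side: E[E(g(Y1, Y2) | Y1) * E(g(Y1, Y3) | Y1)] = E[g_1(Y1)^2].
rhs = sum(p * g1(y) ** 2 for y, p in pmf.items())

# theta = E[g(Y1, Y2)]; then sigma_1^2 = E[g_1(Y1)^2] - theta^2.
theta = sum(pmf[a] * pmf[b] * g(a, b) for a, b in product(pmf, repeat=2))
sigma1_sq = rhs - theta ** 2

print(lhs == rhs, sigma1_sq)
```

Here $Y^k = Y_1$, $A = Y_2$ and $B = Y_3$, so both conditional expectations reduce to $g_1(Y_1)$ and the covariance is $\sigma_1^2 = E[g_1(Y_1)^2] - \theta^2$.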