Prove the existence of conditional expectations in $\mathcal{L}^2$

conditional-expectation, hilbert-spaces, probability-theory

Let $X \in \mathcal{L}^2(\Omega,\mathcal{A},P)$ be a square-integrable real random variable. Then for any sub-$\sigma$-algebra $\mathcal{C}$ of $\mathcal{A}$ and any $\mathcal{C}$-measurable $Y \in \mathcal{L}^2(\Omega,\mathcal{A},P)$ we have

$$E[(X-X_0)^2]\leq E[(X-Y)^2]$$

with equality holding if and only if $Y=X_0$ a.s. Here $X_0$ is a version of the conditional expectation $E[X|\mathcal{C}]$.

The exercise asks to use this result to show the existence of $E[X|\mathcal{C}]$ for any $X \in \mathcal{L}^2(\Omega,\mathcal{A},P)$ using the theory of Hilbert spaces.

My argument is

Pass over to the quotient space $L_2:=\mathcal{L}^2(\Omega,\mathcal{A},P) / \mathcal{N}(\Omega,\mathcal{C},P_{\mathcal{C}}) $ where

$$\mathcal{N}(\Omega,\mathcal{C},P_{\mathcal{C}}):=\{X\in \mathcal{L}^2(\Omega,\mathcal{C},P_{\mathcal{C}}):X=0 \text{ }P_{\mathcal{C}}\text{-a.s.} \} \subset \mathcal{L}^2(\Omega,\mathcal{A},P) $$
and $P_{\mathcal{C}}$ denotes the restriction of $P$ to $\mathcal{C}$. This is a Hilbert space with respect to the norm induced by the following inner product:

$$\langle \tilde{X},\tilde{Y} \rangle := \int XY\,dP$$

where $X, Y \in \mathcal{L}^2(\Omega,\mathcal{A},P)$ are arbitrary representatives of the classes $\tilde{X}, \tilde{Y} \in L_2$, respectively.

Let $L^{\mathcal{C}}_2:=\mathcal{L}^2(\Omega,\mathcal{C},P_{\mathcal{C}}) / \mathcal{N}(\Omega,\mathcal{C},P_{\mathcal{C}})$. Since $\mathcal{L}^2(\Omega,\mathcal{C},P_{\mathcal{C}}) \subset \mathcal{L}^2(\Omega,\mathcal{A},P)$, every equivalence class of $L^{\mathcal{C}}_2$ is an equivalence class of $L_2$, and so $L^{\mathcal{C}}_2$ is a subspace of $L_2$.

Now, $L^{\mathcal{C}}_2$ is a Hilbert space in its own right. Being complete, it is in particular a closed subspace of $L_2$. Hence, given $X\in \mathcal{L}^2(\Omega,\mathcal{A},P)$ with class $\tilde{X} \in L_2$, we can apply the projection theorem and obtain a unique vector $\tilde{X}_0 \in L^{\mathcal{C}}_2$ such that

$$ || \tilde{X} - \tilde{X}_0 || \leq || \tilde{X} - \tilde{Y} || $$

for all $\tilde{Y} \in L^{\mathcal{C}}_2$, with equality holding if and only if $\tilde{Y}=\tilde{X}_0$.

Translating back into expectations and using the result above, we see that each $X_0 \in \tilde{X}_0$ is a version of $E[X|\mathcal{C}]$. Conversely, if $Z_0$ is a version of $E[X|\mathcal{C}]$, then we must have $\tilde{Z}_0=\tilde{X}_0$.

Hence $\tilde{X}_0$ is precisely the class of all conditional expectation versions of $X$.

Am I missing something? My definition of $L_2$ is not the usual one, since I quotient by $\mathcal{N}(\Omega,\mathcal{C},P_{\mathcal{C}})$ instead of $\mathcal{N}(\Omega,\mathcal{A},P)$. I feel like this is necessary.

Best Answer

As noted by Ian, my definition of $L_2$ is problematic because the form $(\tilde{X},\tilde{Y}) \mapsto \int XY\,dP$ is not positive definite, so it is not an inner product. Indeed, if $X=0$ $P$-a.s. and $X$ is not $\mathcal{C}$-measurable, then $\tilde{X}\neq\tilde{0}$ but $\langle \tilde{X},\tilde{X} \rangle =0$.
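
To make the failure concrete, here is one possible instance. The specific space below is an assumption chosen for illustration and is not part of the exercise. Take $\Omega=[0,1]$ with the Borel $\sigma$-algebra $\mathcal{A}$ and Lebesgue measure $P$, let $\mathcal{C}=\{\emptyset,\Omega\}$ be the trivial $\sigma$-algebra, and put $X:=\mathbf{1}_{\{0\}}$. Then $X$ is not constant, hence not $\mathcal{C}$-measurable, so $X \notin \mathcal{N}(\Omega,\mathcal{C},P_{\mathcal{C}})$ and $\tilde{X}\neq\tilde{0}$, yet

$$\langle \tilde{X},\tilde{X} \rangle=\int X^2\,dP=P(\{0\})=0.$$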

We can alternatively proceed thus:

Define $$L_2:=\mathcal{L}^2(\Omega,\mathcal{A},P) / \mathcal{N}(\Omega,\mathcal{A},P)$$

$$L^{\mathcal{C}}_2:=\mathcal{L}^2(\Omega,\mathcal{C},P_{\mathcal{C}}) / \mathcal{N}(\Omega,\mathcal{C},P_{\mathcal{C}})$$

with the same notation as before. Then both $L_2$ and $L^{\mathcal{C}}_2$ are Hilbert spaces with the usual inner product.

Since $\mathcal{L}^2(\Omega,\mathcal{C},P_{\mathcal{C}}) \subset \mathcal{L}^2(\Omega,\mathcal{A},P)$ and $\mathcal{N}(\Omega,\mathcal{C},P_{\mathcal{C}}) \subset \mathcal{N}(\Omega,\mathcal{A},P)$, we see that each equivalence class of $L^{\mathcal{C}}_2$ is included in some equivalence class of $L_2$. Moreover, if $X,Y\in \mathcal{L}^2(\Omega,\mathcal{C},P_{\mathcal{C}})$ belong to the same equivalence class in $L_2$, then $X=Y$ $P$-a.s. (hence also $P_{\mathcal{C}}$-a.s.), and so $X$ and $Y$ belong to the same equivalence class in $L^{\mathcal{C}}_2$.

The previous argument shows that there exists a linear injection $I:L^{\mathcal{C}}_2\to L_2$, sending each class of $L^{\mathcal{C}}_2$ to the class of $L_2$ that contains it. Because only null sets are involved, we see that

$$||\tilde{X}||_{L^{\mathcal{C}}_2}=||I(\tilde{X})||_{L_2} \hspace{1cm} \text{and} \hspace{1cm} \langle \tilde{X},\tilde{Y} \rangle_{L^{\mathcal{C}}_2}= \langle I(\tilde{X}),I(\tilde{Y}) \rangle_{L_2}$$

for all $\tilde{X},\tilde{Y} \in L^{\mathcal{C}}_2$. Hence the completeness of $L^{\mathcal{C}}_2$ implies the completeness of $I(L^{\mathcal{C}}_2)$, i.e. $I(L^{\mathcal{C}}_2)$ is a closed (Hilbert) subspace of $L_2$.
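
The norm identity can be spelled out as follows (a short sketch, with $X$ any $\mathcal{C}$-measurable representative of $\tilde{X}$). Since $P_{\mathcal{C}}$ and $P$ agree on $\mathcal{C}$, the integrals of a nonnegative $\mathcal{C}$-measurable function against the two measures coincide, so

$$||\tilde{X}||^2_{L^{\mathcal{C}}_2}=\int X^2\,dP_{\mathcal{C}}=\int X^2\,dP=||I(\tilde{X})||^2_{L_2},$$

and the identity for inner products then follows by polarization.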

Now the argument can proceed as before, with the conclusions unchanged. For example, the projection theorem now reads:

For every $\tilde{X} \in L_2$ there is a unique vector $\tilde{X}_0 \in I(L^{\mathcal{C}}_2)$ such that

$$ || \tilde{X} - \tilde{X}_0 ||_{L_2} =\min_{\tilde{Y} \in I(L^{\mathcal{C}}_2)} || \tilde{X} - \tilde{Y} ||_{L_2} $$

which, on account of the injectivity of $I$, translates to:

For every $\tilde{X} \in L_2$ there is a unique vector $\tilde{X}_0 \in L^{\mathcal{C}}_2$ such that

$$ || \tilde{X} - I(\tilde{X}_0) ||_{L_2} =\min_{\tilde{Y} \in L^{\mathcal{C}}_2} || \tilde{X} - I(\tilde{Y}) ||_{L_2} $$
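
One way to check directly that the minimizer is a version of $E[X|\mathcal{C}]$, without presupposing that a version exists, is the orthogonality characterization of the projection. This is a sketch of a standard argument, not part of the answer above, and it assumes the usual definition of conditional expectation via integrals over sets of $\mathcal{C}$. Let $X$ be a representative of $\tilde{X}$ and $X_0$ a $\mathcal{C}$-measurable representative of $\tilde{X}_0$. The projection theorem also gives that $\tilde{X}-I(\tilde{X}_0)$ is orthogonal to $I(L^{\mathcal{C}}_2)$. For each $C\in\mathcal{C}$ the indicator $\mathbf{1}_C$ lies in $\mathcal{L}^2(\Omega,\mathcal{C},P_{\mathcal{C}})$, so

$$0=\int (X-X_0)\,\mathbf{1}_C\,dP \quad\Longrightarrow\quad \int_C X\,dP=\int_C X_0\,dP \quad \text{for all } C\in\mathcal{C},$$

which, together with the $\mathcal{C}$-measurability of $X_0$, is the defining property of $E[X|\mathcal{C}]$.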
