[Math] Conditional expectation with respect to a continuous random variable

probability theory

Suppose $X$ is a random point in the Euclidean space $\mathbb R^3$ with some non-discrete distribution; for example, $X$ is uniform in a bounded region. Let $U$ be another random point, uniform on the unit sphere (the boundary of the ball $B(0,1)$) and independent of $X$, and set $Y=X+U$. Set $Z_1=1/|X|$ and $Z_2=1/|Y|$, where $|X|$ is the distance from $X$ to the origin, and similarly for $|Y|$. We want to compute the conditional expectation $E[Z_2\mid Z_1]$.

The computation goes as follows: if we know $Z_1=a$, then $Y$ is a uniform random point on a sphere $S$ of radius $1$ whose center lies at distance $1/a$ from the origin. Using some calculus, one can compute the conditional expectation of $Z_2$ given the information $Z_1=a$:

$$E[Z_2\mid \{Z_1=a\}]=a\quad \text{if} \ a<1$$

and

$$E[Z_2\mid \{Z_1=a\}]=1\quad\text{if}\ a\geq 1$$

So one concludes that $E[Z_2\mid Z_1]\leq Z_1$ (and, iterating the construction, that $E[Z_n\mid Z_{n-1}]\leq Z_{n-1}$, i.e., the sequence is a supermartingale).
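The two cases above can be checked numerically. Here is a small Monte Carlo sketch (the sample size and test values are arbitrary choices, not part of the original argument): by rotational symmetry we may fix $X$ at distance $1/a$ from the origin on a coordinate axis, sample $U$ uniformly on the unit sphere, and average $1/|X+U|$.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_inv_dist(a, n=200_000):
    """Monte Carlo estimate of E[Z_2 | Z_1 = a]: place X at distance 1/a
    from the origin and average 1/|X + U| over U uniform on the unit sphere."""
    x = np.array([1.0 / a, 0.0, 0.0])              # WLOG, by rotational symmetry
    u = rng.normal(size=(n, 3))                    # Gaussian vectors ...
    u /= np.linalg.norm(u, axis=1, keepdims=True)  # ... normalized: uniform on the sphere
    return float(np.mean(1.0 / np.linalg.norm(x + u, axis=1)))

for a in (0.25, 0.5, 2.0, 5.0):
    # the Monte Carlo mean should be close to min(a, 1)
    print(a, mean_inv_dist(a), min(a, 1.0))
```

For $a<1$ the center of the sphere lies outside the unit ball around the origin and the average is $a$; for $a\geq 1$ it lies inside and the average is constantly $1$ (Newton's shell theorem).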

I cannot understand why one can compute the conditional expectation with respect to a continuous random variable as above: if we want to compute $E[Z\mid Y]$ when $Y$ is a continuous r.v., and for every possible value $Y=y$ the conditional expectation $E[Z\mid\{Y=y\}]$ happens to exist and can be computed (denote it by $f(y)$), then one asserts that $$E[Z\mid Y]=f(Y).$$

I know this approach coincides with the intuition behind conditional expectation, but I can't see why it agrees with the formal definition of conditional expectation: for every event $A$ in $\sigma(Y)$, $$\int_A f(Y) \, dP=\int_A Z \, dP.$$
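The defining identity can at least be tested numerically in this example, on events of the form $A=\{Z_1\in I\}$, which generate $\sigma(Z_1)$. The sketch below is a rough sanity check only; the bounded region (a ball of radius 3), sample size, and bins are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

# X uniform in the ball of radius 3 (an arbitrary bounded region):
# uniform direction, radius scaled as cube root of a uniform variable.
x = rng.normal(size=(n, 3))
x /= np.linalg.norm(x, axis=1, keepdims=True)
x *= 3.0 * rng.uniform(size=(n, 1)) ** (1.0 / 3.0)

# U uniform on the unit sphere, independent of X.
u = rng.normal(size=(n, 3))
u /= np.linalg.norm(u, axis=1, keepdims=True)

z1 = 1.0 / np.linalg.norm(x, axis=1)
z2 = 1.0 / np.linalg.norm(x + u, axis=1)
f = np.minimum(z1, 1.0)   # the claimed version f(Z_1) of E[Z_2 | Z_1]

# Defining property on A = {Z_1 in [lo, hi)}: E[Z_2; A] = E[f(Z_1); A],
# i.e. the averages of z2 and f over A should agree.
for lo, hi in [(0.4, 0.7), (0.7, 1.0), (2.0, 4.0)]:
    A = (z1 >= lo) & (z1 < hi)
    print((lo, hi), np.mean(z2[A]), np.mean(f[A]))
```

The two columns agree up to Monte Carlo error, which is what the formal definition demands on these generating events.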

Can anyone explain why? Thanks.

Best Answer

It seems the procedure to be explained relies on the independence structure of the random variables one considers. More specifically, $U$ is independent of $X$ and its distribution is rotationally invariant, hence $(|X|,|Y|)$ is distributed like $(|X|,\,\big||X|e+U\big|)$, where $e$ is a fixed unit vector. Here is a classical result:

Let $\zeta$ and $\eta$ denote some independent random variables and $a$ a measurable function such that $a(\zeta,\eta)$ is integrable. Then $\mathrm E(a(\zeta,\eta)\mid\zeta)=b(\zeta)$, where the function $b$ is defined by $b(z)=\mathrm E(a(z,\eta))$ for every $z$.
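A minimal numerical illustration of this result (the choices of $\zeta$, $\eta$, and $a$ below are arbitrary and not taken from the problem): with $b(z)=\mathrm E(a(z,\eta))$ computed in closed form, $b(\zeta)$ should match $a(\zeta,\eta)$ in average, both globally (tower property) and over events in $\sigma(\zeta)$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000

zeta = rng.uniform(0.0, 2.0, size=n)   # an arbitrary independent pair
eta = rng.normal(size=n)

def a(z, e):
    """An arbitrary integrable function of (zeta, eta)."""
    return z * np.cos(e)

def b(z):
    """b(z) = E[a(z, eta)]; for eta ~ N(0,1), E[cos(eta)] = exp(-1/2)."""
    return z * np.exp(-0.5)

# b(zeta) is a version of E[a(zeta, eta) | zeta]: the tower property
# E[a(zeta, eta)] = E[b(zeta)] must hold, and the averages must also
# agree on any event in sigma(zeta).
print(np.mean(a(zeta, eta)), np.mean(b(zeta)))
A = (zeta > 0.5) & (zeta < 1.0)        # an event in sigma(zeta)
print(np.mean(a(zeta, eta)[A]), np.mean(b(zeta)[A]))
```

The point of the result is precisely that freezing the first argument and integrating out the independent one produces a valid version of the conditional expectation.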

Using this with $\zeta=1/|X|$, $\eta=U$, and the function $a$ defined on $\mathbb R_+^*\times\mathbb R^3$ by $a(z,y)=1/|z^{-1}e+y|=z/|e+zy|$ for every $(z,y)$, one gets $\mathrm E(Z_2\mid Z_1)=b(Z_1)$ with $b(z)=\mathrm E(a(z,U))$... a formula which seems to yield a rigorous foundation to the procedure you explained.