[Math] Conditional expectation conditioned on multiple random variables

conditional-expectation, probability, probability-theory

I came across the following expression:

$E[X \mid Y=y, Z_1, Z_2] = \frac{\sum_x x P(X=x, Y=y \mid Z_1,Z_2)} { P(Y=y \mid Z_1,Z_2)}$

But I have no idea how this simplification works, because I am not sure how to handle a conditional expectation when it's conditioned on several random variables, as in "$Y=y,Z_1,Z_2$" above. Is it possible to just move the $Y=y$ over to the other side like that in a conditional expectation when you have several variables? Can someone please explain the inner workings?

Best Answer

What you have to recognize here is that both sides are random variables: they are functions of $(Z_1,Z_2)$. For example, the left hand side is the function defined by $E[X | Y = y,Z_1,Z_2](z_1,z_2) = E[X | Y = y,Z_1 = z_1, Z_2 = z_2]$.

So how do you check if two random variables are equal? Of course, by checking if they are equal at each point. So fix a point $(z_1,z_2)$ at which both sides are being evaluated, with $P(Y = y, Z_1 = z_1, Z_2 = z_2) > 0$ so that the conditioning makes sense.

The left hand side is $E[X | Y = y, Z_1 = z_1 , Z_2 = z_2]$, which can be written as : $$ \sum_{x} xP(X = x | Y = y , Z_1 = z_1, Z_2 = z_2) $$

This expands to, by the definition of conditional probability : $$ \sum_{x} \frac{xP(X=x,Y=y,Z_1 = z_1, Z_2 = z_2)}{P(Y = y , Z_1 = z_1,Z_2 = z_2)} $$

Multiply and divide by $P(Z_1 = z_1,Z_2 = z_2)$ : $$ \sum_{x} x\frac{P(X=x,Y=y,Z_1 = z_1,Z_2 =z_2) / P(Z_1 = z_1, Z_2 = z_2)}{P(Y= y, Z_1 = z_1,Z_2 = z_2)/P(Z_1 = z_1,Z_2 = z_2)} $$

which simplifies, via the definition of conditional probability, to : $$ \sum_{x} x \frac{P(X= x, Y = y|Z_1 = z_1, Z_2 = z_2)}{P(Y = y | Z_1 = z_1, Z_2 = z_2)} $$

which is nothing but, by definition : $$ \left[\sum_{x} x\frac{P(X=x,Y=y|Z_1,Z_2)}{P(Y=y|Z_1,Z_2)}\right] (z_1,z_2) $$

because the application of the function is the same as what happens when $Z_1 = z_1$ and $Z_2 = z_2$.

Since these functions are equal at all points, they are equal, hence the result follows.
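If it helps to see the pointwise argument with actual numbers, here is a minimal sketch that checks the identity on a small discrete example. The supports and the joint pmf are made up purely for illustration (they are not from the question), and the helper names `prob`, `lhs` and `rhs` are hypothetical. Exact fractions are used so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import product

# Hypothetical supports, chosen only for illustration.
X_VALS, Y_VALS, Z1_VALS, Z2_VALS = (0, 1, 2), (0, 1), (0, 1), (0, 1)

# An arbitrary strictly positive joint pmf P(X=x, Y=y, Z1=z1, Z2=z2),
# built by normalising positive integer weights.
weights = {
    (x, y, z1, z2): 1 + x + 2 * y + 3 * z1 * x + z2 * (x + y)
    for x, y, z1, z2 in product(X_VALS, Y_VALS, Z1_VALS, Z2_VALS)
}
total = sum(weights.values())
pmf = {key: Fraction(w, total) for key, w in weights.items()}

NAMES = ("x", "y", "z1", "z2")


def prob(**fixed):
    """Probability of the event that fixes some of x, y, z1, z2."""
    return sum(
        p for key, p in pmf.items()
        if all(key[NAMES.index(name)] == value for name, value in fixed.items())
    )


def lhs(y, z1, z2):
    """E[X | Y=y, Z1=z1, Z2=z2], straight from the definition."""
    denom = prob(y=y, z1=z1, z2=z2)
    return sum(x * prob(x=x, y=y, z1=z1, z2=z2) / denom for x in X_VALS)


def rhs(y, z1, z2):
    """sum_x x P(X=x, Y=y | Z1, Z2) / P(Y=y | Z1, Z2), evaluated at (z1, z2)."""
    p_z = prob(z1=z1, z2=z2)
    numerator = sum(x * prob(x=x, y=y, z1=z1, z2=z2) / p_z for x in X_VALS)
    denominator = prob(y=y, z1=z1, z2=z2) / p_z
    return numerator / denominator


# Check equality at every point (y, z1, z2), mirroring the argument above.
identity_holds = all(
    lhs(y, z1, z2) == rhs(y, z1, z2)
    for y, z1, z2 in product(Y_VALS, Z1_VALS, Z2_VALS)
)
print(identity_holds)
```

The factor $P(Z_1 = z_1, Z_2 = z_2)$ appears in both `numerator` and `denominator` of `rhs` and cancels, which is exactly the "multiply and divide" step in the derivation.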


To deal with multiple conditioning variables, then, you need to recognize whether you are dealing with an equality of numbers or of functions, like here. Imagine you are fixing values for all the conditioning variables; then you can move a variable from the right of the $|$ to the left of the $|$, as I did with the $Y$ variable, by dividing and multiplying by an appropriate probability. Finally, since this works for every choice of values, whatever is left is an equality of functions, as has happened here.
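Stated for generic events (the symbols $A$, $B$, $C$ are introduced here only for illustration), the step used for $Y$ is the following identity, valid whenever $P(B, C) > 0$ : $$ P(A \mid B, C) = \frac{P(A, B, C)}{P(B, C)} = \frac{P(A, B, C)/P(C)}{P(B, C)/P(C)} = \frac{P(A, B \mid C)}{P(B \mid C)} $$

In the question, $A = \{X = x\}$, $B = \{Y = y\}$ and $C = \{Z_1 = z_1, Z_2 = z_2\}$.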

This trick will of course have a different version once you deal with conditioning over sigma-algebras, like in more rigorous probability.