Expected Value – Expressing Expectation of Joint Distribution Over Discrete and Continuous Variables

Tags: expected value, integral, joint distribution, measure theory

Let $Y$ be a discrete random variable, let $X$ be an (absolutely) continuous random variable, and let $f(X, Y)$ be a function of these two random variables. Let $P(X, Y)$ be the joint probability measure. I am now wondering how to properly write the joint expectation
$\mathbb{E}[f(X, Y)]$. I would write something like:
$$\mathbb{E}[f(X, Y)] = \sum_{y} \int f(x, y) dP(X=x, Y=y),$$
but I think this cannot be quite right, because I shouldn't integrate over $y$ in addition to summing over it.
So I would like to know how to write it down properly, both in terms of a Lebesgue integral, i.e., with respect to a probability measure $dP$, and as a Riemann integral, where I would integrate with respect to $dx$. Do I somehow have to split the probability measure into a conditional and a marginal one?

Best Answer

This can be thought of as a companion answer to my answer to your related question about expectations with respect to joint distributions.

Suppose $X$ and $Y$ are real-valued random variables defined on a probability space $(\Omega, \mathcal{A}, \mathbb{P})$, with $X$ absolutely continuous with respect to Lebesgue measure and $Y$ discrete. Let $\mathbb{P}_{X, Y}$ be their joint distribution.

Then the general formula for the expectation of $f(X, Y)$ will be $$ \mathbb{E}[f(X, Y)] = \int_{\mathbb{R} \times \mathbb{R}} f(x, y) \, \mathbb{P}_{X, Y}(d(x, y)) $$ by either the law of the unconscious statistician or the change of variables formula for pushforward measures, or however else you want to call it. This formula uses neither of the assumptions on $X$ and $Y$ (it only assumes that the expectation exists).
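As a sanity check on the abstract formula, one can estimate the left-hand side by Monte Carlo: draw pairs $(X, Y)$ from the joint distribution and average $f$. The sketch below uses a toy joint distribution of my own choosing (not from the answer): $Y \sim \text{Bernoulli}(0.3)$, $X \mid Y = y \sim N(y, 1)$, and $f(x, y) = x^2 + y$, for which the exact value is $0.7 \cdot 1 + 0.3 \cdot 3 = 1.6$.

```python
# Monte Carlo estimate of E[f(X, Y)] for a toy mixed discrete/continuous pair.
# Setup (an illustrative assumption, not from the answer):
#   Y ~ Bernoulli(0.3),  X | Y = y ~ N(y, 1),  f(x, y) = x^2 + y.
# Since E[X^2 | Y = y] = 1 + y^2, the exact answer is 0.7*1 + 0.3*3 = 1.6.
import random

random.seed(0)

def sample_pair():
    """Draw (X, Y) from the joint distribution: Y first, then X | Y."""
    y = 1 if random.random() < 0.3 else 0   # Y ~ Bernoulli(0.3)
    x = random.gauss(y, 1.0)                # X | Y = y ~ N(y, 1)
    return x, y

def f(x, y):
    return x * x + y

n = 200_000
estimate = sum(f(*sample_pair()) for _ in range(n)) / n
print(estimate)  # close to the exact value 1.6
```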

Alternatively, the law of iterated expectation, as mentioned in the other answer, can be used to yield either $$ \mathbb{E}[f(X, Y)] = \mathbb{E}[\mathbb{E}[f(X, Y) \mid X]] $$ or $$ \mathbb{E}[f(X, Y)] = \mathbb{E}[\mathbb{E}[f(X, Y) \mid Y]]. $$ Again, this does not use the assumptions on $X$ and $Y$, only the existence of the expectation. In practice, one of these two forms might be easier to compute than the other, so it's up to you to choose which one to use.
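To see that both tower-rule forms give the same number, here is a minimal sketch on a toy example of my own (not from the answer): $Y \sim \text{Bernoulli}(0.3)$, $X \mid Y = y \sim N(y, 1)$, $f(x, y) = x^2 + y$. Here $\mathbb{E}[f(X, Y) \mid Y = y] = 1 + y^2 + y$ in closed form, while $\mathbb{E}[f(X, Y) \mid X = x] = x^2 + \mathbb{P}(Y = 1 \mid X = x)$ by Bayes' rule; averaging either conditional expectation over the appropriate marginal should land near the exact value $1.6$.

```python
# Both tower-rule forms on a toy example (my own assumption for illustration):
#   Y ~ Bernoulli(0.3),  X | Y = y ~ N(y, 1),  f(x, y) = x^2 + y.
import math
import random

random.seed(1)
phi = lambda z: math.exp(-z * z / 2) / math.sqrt(2 * math.pi)  # N(0,1) density

def cond_exp_given_y(y):
    # E[f(X, Y) | Y = y] = E[X^2 | Y = y] + y = (1 + y^2) + y
    return 1 + y * y + y

def cond_exp_given_x(x):
    # E[f(X, Y) | X = x] = x^2 + P(Y = 1 | X = x), by Bayes' rule
    p1 = 0.3 * phi(x - 1) / (0.7 * phi(x) + 0.3 * phi(x - 1))
    return x * x + p1

n = 200_000
via_y = via_x = 0.0
for _ in range(n):
    y = 1 if random.random() < 0.3 else 0
    x = random.gauss(y, 1.0)
    via_y += cond_exp_given_y(y)   # E[E[f | Y]]
    via_x += cond_exp_given_x(x)   # E[E[f | X]]
print(via_y / n, via_x / n)  # both near the exact value 1.6
```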

In both cases, you'll probably want to know a conditional distribution of one of the variables with respect to the other, which I'll go over next.

  1. Suppose $\mathbb{P}_{X \mid Y}$ is a conditional distribution of $X$ given $Y$. Then we could write $$ \begin{aligned} \mathbb{E}[f(X, Y)] &= \sum_{y \in \mathbb{R}} \mathbb{E}[f(X, Y) \mid Y = y] \mathbb{P}(Y = y) \\ &= \sum_{y \in \mathbb{R}} \left(\int_{\mathbb{R}} f(x, y) \, \mathbb{P}_{X \mid Y}(dx, y)\right) \mathbb{P}(Y = y). \end{aligned} $$ The question then becomes how to compute $\mathbb{P}_{X \mid Y}$, and this depends on what you know about $X$ and $Y$ to begin with. However, one potential starting point is that this conditional distribution is determined by the condition $$ \mathbb{P}(X \in B, Y \in C) = \sum_{y \in C} \left(\int_B \, \mathbb{P}_{X \mid Y}(dx, y)\right) \mathbb{P}(Y = y) $$ for all Borel sets $B, C \subseteq \mathbb{R}$.

    It might be the case that you can compute a conditional density $p_{X \mid Y}$ of $X$ given $Y$ with respect to Lebesgue measure, in which case we would have $$ \int_B \mathbb{P}_{X \mid Y}(dx, y) = \int_B p_{X \mid Y}(x, y) \, dx, $$ and hence $$ \mathbb{E}[f(X, Y)] = \sum_{y \in \mathbb{R}} \left(\int_{\mathbb{R}} f(x, y) p_{X \mid Y}(x, y) \, dx\right) \mathbb{P}(Y = y). $$

  2. Now suppose $\mathbb{P}_{Y \mid X}$ is a conditional distribution of $Y$ given $X$, and $p_X$ is the density of $X$ with respect to Lebesgue measure. In this case, $$ \mathbb{E}[f(X, Y)] = \int_{\mathbb{R}} \left(\int_{\mathbb{R}} f(x, y) \, \mathbb{P}_{Y \mid X}(d y, x)\right) p_X(x) \, dx. $$ Again, being able to compute $\mathbb{P}_{Y \mid X}$ requires you to know something about $X$ and $Y$ beforehand, but it is determined by the condition $$ \mathbb{P}(X \in B, Y \in C) = \int_B \left(\int_C \, \mathbb{P}_{Y \mid X}(dy, x)\right) p_X(x) \, dx $$ for all Borel sets $B, C \subseteq \mathbb{R}$. In this case, you can compute a conditional probability mass function $p_{Y \mid X}$ of $Y$ given $X$ (i.e., a conditional density with respect to counting measure) explicitly by $$ p_{Y \mid X}(y, x) = \mathbb{P}_{Y \mid X} (\{y\}, x) = \text{"}\mathbb{P}(Y = y \mid X = x)\text{"}, $$ and hence $$ \mathbb{E}[f(X, Y)] = \int_{\mathbb{R}} \left(\sum_{y \in \mathbb{R}} f(x, y) \, p_{Y \mid X}(y, x)\right) p_X(x) \, dx. $$
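As a numerical sanity check of the sum-then-integrate formula of case 1, here is a sketch on a toy example of my own (an assumption for illustration only): $Y \sim \text{Bernoulli}(0.3)$, $\mathbb{P}_{X \mid Y}(\cdot, y)$ is the $N(y, 1)$ distribution, and $f(x, y) = x^2 + y$, so the exact answer is $1.6$. The inner $dx$-integral is evaluated with a plain trapezoid rule.

```python
# Case 1 formula: E[f(X,Y)] = sum_y ( integral f(x,y) p_{X|Y}(x,y) dx ) P(Y=y).
# Toy setup (my own assumption): Y ~ Bernoulli(0.3), p_{X|Y}(x, y) = phi(x - y),
# f(x, y) = x^2 + y; exact answer 0.7*1 + 0.3*3 = 1.6.
import math

phi = lambda z: math.exp(-z * z / 2) / math.sqrt(2 * math.pi)  # N(0,1) density
f = lambda x, y: x * x + y
p_Y = {0: 0.7, 1: 0.3}  # pmf of Y

def integral(g, a=-10.0, b=11.0, n=20_000):
    """Plain trapezoid rule for the inner dx-integral (tails are negligible)."""
    h = (b - a) / n
    s = 0.5 * (g(a) + g(b)) + sum(g(a + i * h) for i in range(1, n))
    return s * h

total = sum(integral(lambda x, y=y: f(x, y) * phi(x - y)) * p
            for y, p in p_Y.items())
print(total)  # close to the exact value 1.6
```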
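The integrate-then-sum formula of case 2 can be sketched the same way, on a toy example of my own choosing (not from the answer): $Y \sim \text{Bernoulli}(0.3)$, $X \mid Y = y \sim N(y, 1)$, $f(x, y) = x^2 + y$. Here $p_X$ is a two-component Gaussian mixture density and $p_{Y \mid X}$ comes from Bayes' rule; the exact answer is again $1.6$.

```python
# Case 2 formula: E[f(X,Y)] = integral ( sum_y f(x,y) p_{Y|X}(y,x) ) p_X(x) dx.
# Toy setup (my own assumption): Y ~ Bernoulli(0.3), X | Y = y ~ N(y, 1),
# f(x, y) = x^2 + y; exact answer 1.6.
import math

phi = lambda z: math.exp(-z * z / 2) / math.sqrt(2 * math.pi)  # N(0,1) density
f = lambda x, y: x * x + y
p_Y = {0: 0.7, 1: 0.3}  # pmf of Y

def p_X(x):
    # Marginal density of X: mixture 0.7*N(0,1) + 0.3*N(1,1)
    return sum(p * phi(x - y) for y, p in p_Y.items())

def p_Y_given_X(y, x):
    # Conditional pmf of Y given X = x, via Bayes' rule
    return p_Y[y] * phi(x - y) / p_X(x)

def g(x):
    # Integrand: ( sum_y f(x, y) p_{Y|X}(y, x) ) * p_X(x)
    return sum(f(x, y) * p_Y_given_X(y, x) for y in p_Y) * p_X(x)

# Trapezoid rule over a range wide enough that the Gaussian tails are negligible
a, b, n = -10.0, 11.0, 20_000
h = (b - a) / n
total = (0.5 * (g(a) + g(b)) + sum(g(a + i * h) for i in range(1, n))) * h
print(total)  # close to the exact value 1.6
```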
