[Math] Why is expectation defined by $\int xf(x)dx$

lebesgue-integral, measure-theory, probability-theory

I recently found out that the expectation of a random variable $X$ in a probability space $(\Omega, \mathcal F, \mathbb P)$, $\mathbb E(X)$, is just the term used in probability theory for the integral of the function $X$ with respect to the measure $\mathbb P$; in the functional notation for measures:

$$
\mathbb E(X)=\mathbb P(X)
$$

or, for a real-valued random variable $X:\Omega\to\mathbb R$:

$$
\mathbb E(X)=\int_\Omega X(\omega)\,\mathbb P(d\omega)
$$

The thing is, I am also familiar with the definition of the expectation of a random variable $X:\Omega\to\mathbb R$ with probability density function $f$ to be:

$$
\mathbb E (X)=\int_\mathbb R xf(x)dx
$$

Here, the integral is with respect to the Lebesgue measure on $\mathbb R$.

We can define the function $f$ in the following way: it satisfies $F(x)=\int_{-\infty}^x f(t)\,dt$, where $F(x)=\mathbb P(X^{-1}((-\infty,x]))$ is the distribution function of $X$.

I can't see any way to show that these two definitions are equal to one another. The only transformation between integrals with respect to different measures that I know is the change of variables $(\mu\circ f^{-1})(g)=\mu(g\circ f)$, which gives that $\mathbb E(g\circ X)=\mu_X(g)$, where $\mu_X$ is the image measure $\mathbb P\circ X^{-1}$, but that doesn't seem to help.

Why can we write $\mathbb E(X)=\int_\mathbb R xf(x)dx$? Why are these two definitions consistent?

Best Answer

Let $(\Omega,\mathcal{F},P)$ be a probability space and $X:\Omega\to\mathbb{R}$ a random variable, i.e. an $(\mathcal{F},\mathcal{B}(\mathbb{R}))$-measurable mapping. Then $X$ induces a probability measure on $(\mathbb{R},\mathcal{B}(\mathbb{R}))$ defined by

$$
P_X(B):=P(X^{-1}(B)),\quad B\in\mathcal{B}(\mathbb{R}),
$$

which is well-defined since $X$ is measurable. This is called the distribution of $X$, or the pushforward measure of $P$ under $X$. The expectation of $X$ is defined as the following Lebesgue integral on $\Omega$:

$$
{\rm E}[X]:=\int_{\Omega} X\,\mathrm dP=\int_\Omega X(\omega)\,P(\mathrm d\omega),
$$

given that this integral exists. This integral can always be transformed into a Lebesgue integral on $(\mathbb{R},\mathcal{B}(\mathbb{R}))$. The following holds:

For any integrable random variable $X$ one has $$ {\rm E}[X]=\int_{\mathbb{R}} x\,P_X(\mathrm dx).\tag{1} $$
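Identity $(1)$ is the special case $g(x)=x$ of the general transfer formula (change of variables for the pushforward measure), which can be written out as:

```latex
% Transfer formula: for any Borel-measurable g : R -> R
% such that g(X) is integrable,
\[
  \int_\Omega g(X(\omega))\,P(\mathrm d\omega)
  = \int_{\mathbb R} g(x)\,P_X(\mathrm dx).
\]
% Choosing g(x) = x on the left gives the defining integral of
% E[X], and on the right gives the integral in (1).
```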

In the special case where $X$ admits a density function, i.e. $P_X(B)=P(X\in B)=\int_B f_X(x)\,\mathrm dx$ for all $B\in\mathcal{B}(\mathbb{R})$ and for some measurable, non-negative function $f_X$, we can simplify $(1)$ even further: $$ {\rm E}[X]=\int_{\mathbb{R}}xf_X(x)\,\mathrm dx. \tag{2} $$
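As a quick numerical sanity check of $(2)$ (not part of the proof), one can compare a Monte Carlo estimate of ${\rm E}[X]$ with a direct Riemann approximation of $\int_{\mathbb R} x f_X(x)\,\mathrm dx$. The sketch below uses an $\mathrm{Exp}(1)$ random variable, whose density is $f_X(x)=e^{-x}$ on $x\ge 0$ and whose mean is $1$; the distribution, truncation point, and step size are illustrative choices:

```python
import math
import random

random.seed(0)
n = 200_000

# Left-hand side of (2): Monte Carlo estimate of the abstract
# integral  E[X] = ∫_Ω X dP,  approximated by averaging i.i.d.
# samples of X ~ Exp(1).
mc_mean = sum(random.expovariate(1.0) for _ in range(n)) / n

# Right-hand side of (2): left-endpoint Riemann approximation of
# ∫ x f_X(x) dx with f_X(x) = exp(-x), truncated to [0, 50]
# (the tail beyond 50 is negligible).
dx = 0.001
riemann = sum(i * dx * math.exp(-i * dx) * dx for i in range(int(50 / dx)))

print(mc_mean, riemann)  # both should be close to 1
```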

A standard technique for showing the results in $(1)$ and $(2)$ is to a) show that they hold for indicator functions, i.e. $X=\mathbf{1}_A$ for $A\in\mathcal{F}$, b) show that if they hold for $X$ and $Y$ then they also hold for $\alpha X+Y$, $\alpha\geq 0$ (so they hold for all non-negative simple functions), and c) show that if they hold for a non-decreasing sequence $(X_n)$ of non-negative random variables then they also hold for $\lim_n X_n$ (by monotone convergence). A general integrable $X$ is then handled by splitting it into positive and negative parts, $X=X^+-X^-$.
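Step a), for instance, comes down to computing the distribution of an indicator, which is a two-point measure:

```latex
% Base case of the approximation argument: X = 1_A with A in F.
% On the one hand, E[1_A] = P(A) directly from the definition.
% On the other hand, the distribution of 1_A is
%   P_{1_A} = P(A^c) \delta_0 + P(A) \delta_1,
% so the right-hand side of (1) evaluates to
\[
  \int_{\mathbb R} x\,P_{\mathbf 1_A}(\mathrm dx)
  = 0\cdot P(A^{c}) + 1\cdot P(A)
  = P(A)
  = {\rm E}[\mathbf 1_A].
\]
% Steps b) and c) then extend the identity from indicators to
% simple, non-negative, and finally all integrable X.
```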
