Probability Theory – Why Use Borel Measurable Random Variables?

measure-theory pr.probability

I've been studying a bit of probability theory lately and noticed that there seems to be universal agreement that random variables should be defined as Borel measurable functions on the probability space rather than Lebesgue measurable functions. This is so in every textbook on probability theory that I have consulted. In general, it seems to me that probability theory favors the Borel algebra over the algebra of Lebesgue measurable sets. My question is: why?

In every course in measure theory, one is taught the notions of a complete measure and of the completion of a measure. I got the impression that a complete measure space is somewhat superior to a non-complete one (or at least that completeness makes life a bit easier on the technical level), so this preference for Borel sets puzzles me.

Best Answer

One should be careful with the definitions here. Notation: Given measurable spaces $(X, \mathcal{B}_X), (Y, \mathcal{B}_Y)$, a measurable map $f : X \to Y$ is one such that $f^{-1}(A) \in \mathcal{B}_X$ for every $A \in \mathcal{B}_Y$. To be explicit, I'll say $f$ is $(\mathcal{B}_X, \mathcal{B}_Y)$-measurable.

Let $\mathcal{B}$ be the Borel $\sigma$-algebra on $\mathbb{R}$, so the Lebesgue $\sigma$-algebra $\mathcal{L}$ is its completion with respect to Lebesgue measure $m$. Then for functions $f : \mathbb{R} \to \mathbb{R}$, "Borel measurable" means $(\mathcal{B}, \mathcal{B})$-measurable. "Lebesgue measurable" means $(\mathcal{L},\mathcal{B})$-measurable; note the asymmetry! Already this notion has some defects; for instance, if $f,g$ are Lebesgue measurable, $f \circ g$ need not be, even if $g$ is continuous. (See Exercise 2.9 in Folland's Real Analysis.)

$(\mathcal{L}, \mathcal{L})$-measurable functions are not so useful; for instance, a continuous function need not be $(\mathcal{L}, \mathcal{L})$-measurable. (The $g$ from the aforementioned exercise is an example.) $(\mathcal{B}, \mathcal{L})$-measurability is even worse: since $\mathcal{L} \not\subset \mathcal{B}$, not even the identity map is $(\mathcal{B}, \mathcal{L})$-measurable.
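
To make the counterexample concrete, here is a sketch of the standard construction behind that exercise. The labels $c$, $\psi$, $C$, $A$ and $N$ are my own, not necessarily Folland's notation, and the routine verifications are left as in the exercise. Let $C \subset [0,1]$ be the Cantor set, $c : [0,1] \to [0,1]$ the Cantor function, and $\psi(x) = x + c(x)$. Then $\psi$ is a strictly increasing continuous bijection $[0,1] \to [0,2]$, hence a homeomorphism. Since $c$ is constant on each interval of $[0,1] \setminus C$, $\psi$ maps each such interval to an interval of the same length, so $m(\psi(C)) = 2 - 1 = 1$. Every set of positive Lebesgue measure contains a non-measurable subset, so pick $A \subset \psi(C)$ with $A \notin \mathcal{L}$ and put $N = \psi^{-1}(A) \subset C$. As a subset of a null set, $N \in \mathcal{L}$. Now take

$$g := \psi^{-1} : [0,2] \to [0,1] \ \text{(continuous)}, \qquad f := \mathbf{1}_N \ \text{(Lebesgue measurable)}.$$

Then

$$(f \circ g)^{-1}(\{1\}) = g^{-1}(N) = \psi(N) = A \notin \mathcal{L},$$

so $f \circ g$ is not Lebesgue measurable. The same computation shows that the continuous map $g$ pulls the Lebesgue set $N$ back to a non-measurable set, so $g$ is not $(\mathcal{L}, \mathcal{L})$-measurable. It also shows that $N$ is Lebesgue measurable but not Borel, since the preimage of a Borel set under a continuous map is Borel. (If one insists on functions defined on all of $\mathbb{R}$, $g$ can be extended continuously outside $[0,2]$ without affecting the argument.)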

Given a probability space $(\Omega, \mathcal{F},P)$, our random variables are $(\mathcal{F}, \mathcal{B})$-measurable functions $X : \Omega \to \mathbb{R}$. The Lebesgue $\sigma$-algebra $\mathcal{L}$ does not appear. As mentioned, it would not be useful to consider $(\mathcal{F}, \mathcal{L})$-measurable functions; there simply may not be enough good ones, and they may not be preserved by composition with continuous functions. Anyway, the right analogue of "Lebesgue measurable" would be to use the completion of $\mathcal{F}$ with respect to $P$, and this is commonly done. Indeed, many theorems assume a priori that $\mathcal{F}$ is complete.

Note that, for reasons similar to those above, we should expect $f(X)$ to be another random variable when $f$ is Borel measurable, but not when $f$ is Lebesgue measurable. Using $(\mathcal{F}, \mathcal{L})$ in our definition of "random variable" would not avoid this, either.
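
Spelled out, the claim about $f(X)$ is a one-line preimage computation. For an $(\mathcal{F}, \mathcal{B})$-measurable $X$ and any $B \in \mathcal{B}$,

$$(f \circ X)^{-1}(B) = X^{-1}\bigl(f^{-1}(B)\bigr).$$

If $f$ is Borel measurable, then $f^{-1}(B) \in \mathcal{B}$, so $X^{-1}(f^{-1}(B)) \in \mathcal{F}$ and $f(X)$ is a random variable. If $f$ is only Lebesgue measurable, we know only that $f^{-1}(B) \in \mathcal{L}$, and the preimage under $X$ of a set in $\mathcal{L} \setminus \mathcal{B}$ need not lie in $\mathcal{F}$. If instead we required random variables to be $(\mathcal{F}, \mathcal{L})$-measurable, then $X^{-1}(f^{-1}(B)) \in \mathcal{F}$ would hold for $B \in \mathcal{B}$, but for $f(X)$ to be a random variable in the new sense we would need

$$X^{-1}\bigl(f^{-1}(E)\bigr) \in \mathcal{F} \quad \text{for every } E \in \mathcal{L},$$

and a Lebesgue measurable (even continuous) $f$ need not satisfy $f^{-1}(E) \in \mathcal{L}$ for such $E$.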

The moral is this: To get as many $(\mathcal{B}_X, \mathcal{B}_Y)$-measurable functions $f : X \to Y$ as possible, one wants $\mathcal{B}_X$ to be as large as possible, so it makes sense to use a complete $\sigma$-algebra there. (You already know some of the nice properties of this, e.g. an a.e. limit of measurable functions is measurable.) But one wants $\mathcal{B}_Y$ to be as small as possible. When $Y$ is a topological space, we usually want to be able to compose $f$ with continuous functions $g : Y \to Y$, so $\mathcal{B}_Y$ had better contain the open sets (and hence the Borel $\sigma$-algebra), but we should stop there.