The need for outer measure when extending measure

measure-theory

I am curious why Ash and Doleans-Dade (1999) introduce an outer measure when they extend a measure to a larger class of sets.

Here is the basic roadmap they take:

  1. Begin with $\mathscr{F}_0$, a field of subsets of a set $\Omega$. Let $P$ be a probability measure on $\mathscr{F}_0$. Consider increasing sequences $A_m\uparrow A$ and $A'_n\uparrow A'$ of sets in $\mathscr{F}_0$, whose limits may not belong to $\mathscr{F}_0$, and establish that if $A\subset A'$, then $\lim_{m\rightarrow\infty}P(A_m)\leq\lim_{n\rightarrow\infty}P(A'_n)$. If both sequences increase to the same limit, equality holds.
  2. The authors produce a larger class, $\mathscr{G}$, which is the collection of all limits of increasing sequences of sets in $\mathscr{F}_0$, or equivalently the collection of all countable unions of sets in $\mathscr{F}_0$. Because (1) established how the probability measure $P$ behaves along increasing sequences, this first extension of $P$ to $\mathscr{G}$ is natural. The extended set function on $\mathscr{G}$ is denoted by $\mu$.
  3. Now, the authors extend $\mu$ to the class of all subsets of $\Omega$, which also seems natural, but they comment that the extension will NOT be countably additive on all subsets, only on a smaller $\sigma$-field. They then define the outer measure as $\mu^*(A)=\inf\{\mu(G):G\in\mathscr{G},A\subset G\}.$

My Question:
Why do we introduce this $\mu^*$ in step 3 of the extension of measures? Also, can $\mu^*$ be pictured as covering $A$ with smaller boxes (i.e. a bunch of subsets of $\Omega$ placed over $A$)?

Reference:
$\textit{Probability and Measure Theory}$ (Robert B. Ash and Catherine A. Doleans-Dade), Harcourt/Academic Press, 1999.

Best Answer

1. Why $\mu^*$?

This outer measure $\mu^*$ not only extends $\mu$, but also has some good properties. It can be proved that the $\sigma$-algebra $\mathcal{M}_{\mu^*}$ of $\mu^*$-measurable sets (the sets $A \subseteq \Omega$ such that $\mu^*(E)=\mu^*(E\cap A) +\mu^*(E\cap A^c)$ for every $E\subseteq \Omega$) contains the completion $\overline{\mathscr{G}}$ of $\mathscr{G}$ and that $\mu^*(A)=\overline{\mu}(A)$ for every $A\in \overline{\mathscr{G}}$. Moreover, if $\mu$ is $\sigma$-finite, then $\overline{\mathscr{G}}=\mathcal{M}_{\mu^*}$ and $\mu^*|_{\mathcal{M}_{\mu^*}}=\overline{\mu}$. This is important, because it shows that the Lebesgue measure on $(\mathbb{R}, \mathcal{M}_{\lambda^*})$ is just the completion of the Lebesgue measure on $(\mathbb{R},\mathcal{B}(\mathbb{R}))$.
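To make the defining condition for $\mu^*$-measurable sets concrete, here is a minimal sketch on a finite toy model. Everything in it is an assumption made for illustration and is not taken from the book: $\Omega$ has six points, the field is generated by three blocks with made-up probabilities, and the names `BLOCKS`, `outer` and `is_caratheodory` are hypothetical. In this finite setting the cheapest field set covering $A$ is the union of the blocks that meet $A$, so $\mu^*(A)$ is the sum of their probabilities.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy model (not from the book): six points, and a field
# generated by three blocks with probabilities 1/2, 1/3 and 1/6.
OMEGA = (0, 1, 2, 3, 4, 5)
BLOCKS = {
    frozenset({0, 1}): Fraction(1, 2),
    frozenset({2, 3}): Fraction(1, 3),
    frozenset({4, 5}): Fraction(1, 6),
}


def subsets(points):
    """Every subset of a finite collection of points, as frozensets."""
    points = list(points)
    return [frozenset(c) for r in range(len(points) + 1)
            for c in combinations(points, r)]


def outer(A):
    """mu*(A): total probability of the blocks that meet A."""
    return sum((p for block, p in BLOCKS.items() if block & A), Fraction(0))


def is_caratheodory(A):
    """Check mu*(E) = mu*(E & A) + mu*(E - A) for every test set E."""
    return all(outer(E) == outer(E & A) + outer(E - A)
               for E in subsets(OMEGA))


measurable = [A for A in subsets(OMEGA) if is_caratheodory(A)]
print(len(measurable))  # 8: exactly the unions of blocks
```

In this toy model the sets passing the test are precisely the unions of blocks, so a set such as $\{0\}$ that splits a block of positive probability is rejected (take $E=\{0,1\}$ in the condition).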

2. Why smaller boxes?

The most common example of an outer measure is defined in $\mathbb{R}$ by

\begin{equation} \lambda^*(A)=\inf\bigg\{\sum_{j=1}^{\infty} \lambda(I_j): A\subseteq \bigcup\limits_{j=1}^\infty I_j,\ I_j \text{ intervals} \bigg\} \end{equation}

where $A\subseteq \mathbb{R}$ and $\lambda$ is the measure which assigns to each interval its length. (The $\sigma$-algebra generated by all intervals is simply $\mathcal{B}(\mathbb{R})$.)

As you said, in order to "measure" a set $A \subseteq \mathbb{R}$, we cover $A$ with countably many intervals (the smaller boxes) and add up their lengths. The infimum of these total lengths over all such covers is $\lambda^*(A)$.
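As an illustration of the covering picture, the following sketch covers finitely many rationals in $[0,1]$ by intervals whose total length stays below a chosen $\varepsilon$. It is only a sketch under stated assumptions: the function names and the cut-off `max_den` are hypothetical, and the $j$-th point receives an interval of length $\varepsilon/2^j$, which is the usual device for showing that a countable set has outer measure $0$.

```python
from fractions import Fraction


def rationals_in_unit_interval(max_den):
    """Distinct rationals p/q in [0, 1] with q <= max_den, in a fixed order."""
    seen = []
    for q in range(1, max_den + 1):
        for p in range(0, q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.append(r)
    return seen


def cover(points, eps):
    """Cover the j-th point (j = 1, 2, ...) by an interval of length eps / 2**j."""
    intervals = []
    for j, x in enumerate(points, start=1):
        half = eps / 2 ** (j + 1)  # half of the interval length eps / 2**j
        intervals.append((x - half, x + half))
    return intervals


def total_length(intervals):
    """Sum of the lengths of the covering intervals."""
    return sum((b - a for a, b in intervals), Fraction(0))


eps = Fraction(1, 100)
points = rationals_in_unit_interval(5)
boxes = cover(points, eps)
print(len(points), total_length(boxes) < eps)
```

The total length is $\varepsilon(1-2^{-N})$ for $N$ points, so it stays below $\varepsilon$ no matter how many rationals are enumerated. Letting $\varepsilon\to 0$ gives $\lambda^*(\mathbb{Q}\cap[0,1])=0$.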

It is clear that $\lambda^*$ extends $\lambda$. But why do we need intervals to define $\lambda^*$? The key is to describe $\lambda^*$ using only measures that we already know. If we covered with arbitrary sets $A_n$ instead of intervals, we would eventually have to talk about $\lambda(A_n)$, that is, about the "lengths" of arbitrary subsets $A_n\subseteq\mathbb{R}$. This is what we want to avoid, because some subsets are a little "weird", like the Vitali sets (in fact there are uncountably many such sets in $\mathbb{R}$), and we cannot talk about their length. Furthermore, on such sets the outer measure may fail to be countably additive, although it is countably additive on $\mathcal{M}_{\lambda^*}$.
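Vitali sets cannot be computed with, but the failure of additivity can be seen in a finite analogue. The sketch below is a hypothetical toy model, not the Lebesgue case: $\Omega=\{0,1,2,3\}$, the field is generated by the blocks $\{0,1\}$ and $\{2,3\}$ with probability $1/2$ each, and $\mu^*$ is computed literally as the minimum of $\mu(G)$ over field sets $G$ covering $A$. The names `BLOCKS`, `field_sets`, `mu` and `outer` are assumptions made for this example.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy model: two blocks, each with probability 1/2.
BLOCKS = {
    frozenset({0, 1}): Fraction(1, 2),
    frozenset({2, 3}): Fraction(1, 2),
}


def field_sets():
    """All unions of blocks, i.e. the sets on which mu is defined."""
    blocks = list(BLOCKS)
    return [frozenset().union(*combo)
            for r in range(len(blocks) + 1)
            for combo in combinations(blocks, r)]


def mu(G):
    """Measure of a field set: sum over the blocks contained in it."""
    return sum((p for block, p in BLOCKS.items() if block <= G), Fraction(0))


def outer(A):
    """mu*(A) = min of mu(G) over field sets G that cover A."""
    return min(mu(G) for G in field_sets() if A <= G)


a, b = frozenset({0}), frozenset({1})
print(outer(a), outer(b), outer(a | b))  # 1/2 1/2 1/2
```

The disjoint sets $\{0\}$ and $\{1\}$ each have outer measure $1/2$, yet so does their union, so $\mu^*$ is only subadditive here. Neither set belongs to the field, which is why additivity is only guaranteed on the measurable sets.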
