Why is a $\sigma$-algebra necessary to define a measure?

analysis, lebesgue-measure, measure-theory, real-analysis

I am reading a book (Teoría de la medida, Jaime San Martín; unfortunately it is not available as a PDF), and it gives the following definition of a measure:

Let $X$ be a set and $\mathcal{C}\subseteq\mathcal{P}(X)$ such that $\emptyset\in\mathcal{C}$. A function $\mu:\mathcal{C}\to\overline{\mathbb{R}}_+$ is said to be a measure on $\mathcal{C}$ if it satisfies:

  1. $\mu(\emptyset)=0$
  2. $\mu$ is $\sigma$-additive, i.e., if $\{A_n\}_{n\in\mathbb{N}}\subseteq\mathcal{C}$ is a sequence of pairwise disjoint sets such that $\bigcup_{n\in\mathbb{N}}A_n\in\mathcal{C}$, then $$\mu\left(\bigcup_{n\in\mathbb{N}}A_n\right)=\sum_{n\in\mathbb{N}}\mu(A_n)$$

The sets $A\in\mathcal{C}$ are called measurable.

Note that $\mathcal{C}$ is not required to be a $\sigma$-algebra, unlike in other texts I have read. In fact, the book first defines what a measure is and only two pages later defines what a $\sigma$-algebra is. It's the first time I have seen the elements of an arbitrary collection called "measurable", since I have always used that word only for elements of a $\sigma$-algebra.

What is the advantage of working on $\sigma$-algebras? I mean, the Lebesgue measure is defined on the Borel $\sigma$-algebra, but $\mathcal{P}(\mathbb{R})$ is also a $\sigma$-algebra, and the Lebesgue measure cannot be used on it because the Vitali set is in $\mathcal{P}(\mathbb{R})$; in fact, I am not sure whether it is possible to define any measure on $\mathcal{P}(\mathbb{R})$ at all. This confuses me, because there may be problems with this book's definition of measurable sets, and I need to work with this book. Finally, what kind of relationship must a measure have with a $\sigma$-algebra (or, according to the definition in this book, with any $\mathcal{C}\subseteq\mathcal{P}(X)$) so that there is no inconsistency when measuring sets and no confusion when referring to "measurable sets"?

Best Answer

I think you have basically answered all of your questions yourself already.

The power set is a $\sigma$-algebra, but in some cases (e.g., $X=\mathbb R$) it contains too many sets: a measure with the properties we would like to have (for the Lebesgue measure, translation invariance and assigning each interval its length) cannot be defined on all of it. The solution is to use a smaller $\sigma$-algebra. This kind of move is made in mathematics all the time; for example, a strong solution of a differential equation may not exist, so mathematicians dropped some requirements on the solution and came up with the notion of a weak solution.
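To see concretely what goes wrong on the power set, here is a sketch of the standard Vitali argument, assuming the axiom of choice. The names $\mu$, $V$ and $V_q$ are notation introduced only for this sketch. Suppose $\mu$ were a translation-invariant measure on $\mathcal P(\mathbb R)$ with $\mu([0,1))=1$. Choose one representative in $[0,1)$ from each equivalence class of the relation $x\sim y \iff x-y\in\mathbb Q$, let $V$ be the set of these representatives, and for each $q\in\mathbb Q\cap[0,1)$ put $V_q=\{(v+q) \bmod 1 : v\in V\}$. The sets $V_q$ are pairwise disjoint, their union is $[0,1)$, and translation invariance together with additivity gives $\mu(V_q)=\mu(V)$. Then $\sigma$-additivity would force

$$1=\mu([0,1))=\sum_{q\in\mathbb Q\cap[0,1)}\mu(V_q)=\sum_{q\in\mathbb Q\cap[0,1)}\mu(V)\in\{0,\infty\},$$

which is impossible. So no such $\mu$ exists on all of $\mathcal P(\mathbb R)$, while on the smaller Borel $\sigma$-algebra the set $V$ simply never has to be measured.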

Now you may ask why it has to be a $\sigma$-algebra and not, say, some other subset of the power set. You can think of a $\sigma$-algebra as something that encodes the notion of information. Assume you know all possible outcomes, i.e., the whole set $X$ is contained in the $\sigma$-algebra. Then, if you can tell that an event happened, you can also tell that its opposite did not happen, i.e., if a set is contained in the $\sigma$-algebra, its complement is also contained in the $\sigma$-algebra. Finally, you can combine countably many events to say what happened overall, i.e., if countably many sets are contained in the $\sigma$-algebra, so is their union. A measure then assigns a number to each event, which we can interpret, for example, as a probability or a volume. Therefore, it's reasonable to define measures on $\sigma$-algebras. This is only a heuristic argument, though.
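The closure rules above can be tried out on a small finite example. The following is a minimal Python sketch, not a general-purpose implementation: the function name `generate_sigma_algebra` and the sets `X` and `sigma` are made up for illustration, and on a finite set countable unions reduce to finite ones, so closing under complements and pairwise unions is enough.

```python
from itertools import combinations

def generate_sigma_algebra(universe, generators):
    """Close `generators` under complement and union inside a finite universe."""
    universe = frozenset(universe)
    # Start with the empty set, the whole space, and the generating sets.
    family = {frozenset(), universe} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)  # snapshot, so we can add to `family` safely
        # Closure under complement.
        for a in current:
            comp = universe - a
            if comp not in family:
                family.add(comp)
                changed = True
        # Closure under (pairwise, hence finite) union.
        for a, b in combinations(current, 2):
            union = a | b
            if union not in family:
                family.add(union)
                changed = True
    return family

X = {1, 2, 3, 4}
sigma = generate_sigma_algebra(X, [{1}, {2}])
print(len(sigma))  # 8
```

Here the "information" is whether the outcome is 1, is 2, or is something else. The resulting family contains $\{3,4\}$ but not $\{3\}$: with this information the outcomes 3 and 4 cannot be told apart, so a measure on this $\sigma$-algebra is never asked to assign a value to $\{3\}$.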

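Finally, as for whether any measure at all can be defined on $\mathcal P(\mathbb R)$: yes. For any fixed $x\in\mathbb R$, $$\delta_x(A) = \begin{cases} 1 & \text{if $x\in A$,} \\ 0 & \text{otherwise,} \end{cases}$$ defines a measure on $\mathcal P(\mathbb R)$, the Dirac measure at $x$.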
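As a quick check against the two conditions in the book's definition (a short verification; the sequence $\{A_n\}$ is arbitrary): first, $\delta_x(\emptyset)=0$ because $x\notin\emptyset$. Second, if the sets $A_n\subseteq\mathbb R$ are pairwise disjoint, then $x$ lies in at most one of them, so at most one term of the sum is nonzero, and

$$\sum_{n\in\mathbb N}\delta_x(A_n)=\begin{cases} 1 & \text{if $x\in A_n$ for some (necessarily unique) $n$,} \\ 0 & \text{otherwise,} \end{cases} \;=\; \delta_x\left(\bigcup_{n\in\mathbb N}A_n\right).$$

So the obstruction on $\mathcal P(\mathbb R)$ is not to measures in general but to measures with extra properties such as translation invariance, which $\delta_x$ does not have.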
