Questions about Rudin’s proof of Lebesgue’s Monotone Convergence Theorem

analysis, measure-theory, proof-explanation, real-analysis

I just finished working through Papa Rudin's proof of Lebesgue's Monotone Convergence Theorem, and I have some questions:

  1. Why is each $E_n$ measurable?
  2. How is $(6)$ concluded, i.e. $\displaystyle \alpha \ge \int_X s\ d\mu$?
    $(5)$ holds for all $0<c<1$, but not for $c=1$ itself. Did we do something like taking $\displaystyle \lim_{c\to 1}$ on both sides of the inequality $(5)$? Is that allowed (and why)?
  3. How do we get from $(6)$ to $(7)$? $(6)$ holds for every simple measurable $s$ satisfying $0\le s\le f$, but $f$ itself may not be simple! Are we supposed to use Theorem 1.17 (approximating measurable functions on $[0,\infty]$ by simple measurable functions) here – and construct a sequence of simple measurable functions $\{s_n\}_{n\in\Bbb N}$ such that $0\le s_1 \le s_2 \le \ldots \le f$, and $s_n \stackrel{n\to\infty}{\longrightarrow} f$ pointwise? Then $(6)$ would hold for every $s_n$ ($n\in \Bbb N$), and taking $\displaystyle \lim_{n\to\infty}$ should give $(7)$. Am I correct about this?
  4. Finally, why are the $E_n$ constructed in such a way? $$E_n = \{x:f_n(x) \ge cs(x)\}$$ seems out of the blue! What is the motivation behind this (it definitely does the job, no doubt – but why this particular choice of $E_n$)?

Proof attached for reference:

[Images of Rudin's proof of the Monotone Convergence Theorem]

Thanks a lot for your time and help!

Best Answer

Question 1.

For any measurable functions $f,g:X\to[0,\infty]$, it is a standard exercise (one you should definitely attempt yourself; failing that, there are certainly questions about it on this site) that $[f\leq g]:=\{x\in X\,|\,f(x)\leq g(x)\}$ is measurable; from this it follows that $[f\geq g]$, $[f<g]$, $[f>g]$, and $[f=g]$ are all measurable. Do you see how this applies to your case?
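In case you want a hint for that exercise, here is one standard route (a sketch in my own words, not Rudin's), which tests against rational values:
$$[f<g]\;=\;\bigcup_{q\in\Bbb Q}\bigl([f<q]\cap[q<g]\bigr),$$
since $f(x)<g(x)$ exactly when some rational $q$ fits strictly between the two values. Each set $[f<q]$ and $[q<g]$ is measurable because $f$ and $g$ are measurable, so $[f<g]$ is a countable union of measurable sets, and $[f\geq g]=X\setminus[f<g]$ is measurable as well. For your case, apply this with $f_n$ in place of $f$ and the (simple, hence measurable) function $cs$ in place of $g$, which gives the measurability of $E_n=[f_n\geq cs]$.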


Question 2.

Yes, you can take limits in $(5)$. For every $0<c<1$, we have $\alpha\geq c\int_X s\,d\mu$, so take the limit as $c\to 1^-$ on both sides. You will surely have seen in a basic analysis course that non-strict inequalities are preserved under limits.
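Spelled out (my phrasing, not the answer's), one can even phrase this as a supremum over $c$ rather than a limit; with the usual convention $c\cdot\infty=\infty$ for $c>0$, this also covers the case $\int_X s\,d\mu=\infty$:
$$\alpha \;\geq\; \sup_{0<c<1}\, c\int_X s\,d\mu \;=\; \Bigl(\sup_{0<c<1} c\Bigr)\int_X s\,d\mu \;=\; \int_X s\,d\mu.$$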


Question 3.

From $(6)$ to $(7)$ is simply an application of the definition of the Lebesgue integral as a supremum. We defined $\alpha$ at the beginning, and what $(6)$ shows is that for every simple $0\leq s\leq f$, we have $\int_Xs\,d\mu\leq \alpha$. In other words, the set of numbers $\left\{\int_Xs\,d\mu\,:\,\text{$0\leq s\leq f$ is simple}\right\}$ has $\alpha$ as an upper bound. Thus, by the definition of supremum, \begin{align} \sup\left\{\int_Xs\,d\mu\,:\,\text{$0\leq s\leq f$ is simple}\right\} \leq \alpha. \end{align} But the LHS is none other than $\int_Xf\,d\mu$, by definition. In particular, there is no need for the approximation argument via Theorem 1.17 here.


Question 4.

The inequality $\lim\limits_{n\to \infty}\int_Xf_n\,d\mu =\alpha \leq \int_Xf\,d\mu$ is an easy consequence of the monotonicity of the integral. The other direction is not so trivial, because the definition of $\int_Xf\,d\mu$ involves a huge supremum over all simple functions. Also, up to this point in the treatment, the only things we really know about integrals are basic facts (Theorem 1.24), and the only integrals we can explicitly calculate are those of simple functions. The idea is therefore to somehow reduce the more complicated problem of proving the reverse inequality to something more familiar.
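For completeness, here is the easy direction spelled out (a sketch in my words): since $0\leq f_1\leq f_2\leq\cdots$ and $f_n\to f$ pointwise, we have $f_n\leq f$ for every $n$, so monotonicity of the integral gives
$$\int_X f_n\,d\mu \;\leq\; \int_X f\,d\mu \qquad \text{for every } n,$$
and letting $n\to\infty$ yields $\alpha=\lim\limits_{n\to\infty}\int_X f_n\,d\mu\leq \int_X f\,d\mu$.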

By the definition of the Lebesgue integral as a supremum, showing that $\int_Xf\,d\mu\leq \alpha$ is thus equivalent to showing that for every simple $0\leq s\leq f = \lim f_n$, we have $\int_Xs\,d\mu \leq \alpha$. It would be nice if we could say something like: "because $s$ is smaller than $f$ and $f_n$ increases to $f$, for large $n$ we have $s\leq f_n$". If we could say this, then $\int_Xs\,d\mu \leq \int_Xf_n\,d\mu$ for all large $n$, and hence, taking limits, $\int_Xs\,d\mu \leq \alpha$, completing the proof.

Unfortunately, this isn't quite right. We CANNOT deduce that $s\leq f_n$ for large $n$ (see the example below). So, what can we do? We give ourselves some room for error by introducing the scaling factor $0<c<1$. Then $cs$ is an overall scaled-down function, and for it we can obtain nice control: $cs\leq f_n$ on $E_n$, and $\bigcup E_n = X$. By Rudin's argument, we thus deduce $c\int_Xs\,d\mu \leq \alpha$. Then, finally, we let $c\to 1^-$.
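Here is a concrete instance of the failure (my own example, not from the proof): on $X=[0,1]$ with Lebesgue measure, take $f_n=(1-\frac1n)\chi_{[0,1]}$, so that $f_n$ increases pointwise to $f=\chi_{[0,1]}$. The function $s=\chi_{[0,1]}$ is simple and satisfies $0\leq s\leq f$, yet
$$s(x)=1>1-\tfrac1n=f_n(x)\qquad\text{for every }n\text{ and every }x\in[0,1],$$
so $s\leq f_n$ fails for every $n$. But for any fixed $0<c<1$, we have $cs=c\leq 1-\frac1n=f_n$ as soon as $n\geq\frac{1}{1-c}$; that is, $E_n=X$ for all large $n$, which is exactly the kind of control the proof exploits.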

This final idea of "giving yourself some room for error" is a VERY common idea in analysis. If you've studied baby Rudin (or Spivak, or any other introductory book), you have surely seen such ideas. The simplest example of this I can think of is that, given $z\in \Bbb{C}$, we have $z=0$ if and only if $|z|\leq \epsilon$ for every $\epsilon>0$. These are equivalent statements, but sometimes the second statement is easier to prove, because you have an $\epsilon$ amount of wiggle room to establish the inequality $|z|\leq \epsilon$.
