Theorem 7.26 in Baby Rudin: The Stone-Weierstrass Theorem

Here is Theorem 7.26 in the book Principles of Mathematical Analysis by Walter Rudin, 3rd edition:

If $f$ is a continuous complex function on $[a, b]$, there exists a sequence of polynomials $P_n$ such that
$$ \lim_{n \to \infty} P_n(x) = f(x) $$
uniformly on $[a, b]$. If $f$ is real, the $P_n$ may be taken real.

Here is Rudin's proof:

We may assume, without loss of generality, that $[a, b] = [0, 1]$. [I would like to have an explicit argument on how the truth of this result for the closed unit interval $[0, 1]$ would lead to the truth of this result for an arbitrary closed interval $[a, b]$.] We may also assume that $f(0) = f(1) = 0$. For if the theorem is proved for this case, consider $$ g(x) = f(x) - f(0) - x [ f(1) - f(0) ] \qquad (0 \leq x \leq 1). $$ Here $g(0) = g(1) = 0$, and if $g$ can be obtained as the limit of a uniformly convergent sequence of polynomials, it is clear that the same is true for $f$, since $f-g$ is a polynomial.

Furthermore, we define $f(x)$ to be zero for $x$ outside $[0, 1]$. Then $f$ is uniformly continuous on the whole line. [How is $f$ uniformly continuous? How to show this rigorously?]

We put
$$\tag{47} Q_n(x) = c_n \left( 1- x^2 \right)^n \qquad (n = 1, 2, 3, \ldots), $$
where $c_n$ is chosen so that
$$ \tag{48} \int_{-1}^1 Q_n(x) \ \mathrm{d} x = 1 \qquad (n = 1, 2, 3, \ldots). $$
We need some information about the order of magnitude of $c_n$. Since
$$
\begin{align}
\int_{-1}^1 \left( 1-x^2 \right)^n \ \mathrm{d} x = 2 \int_0^1 \left( 1-x^2 \right)^n \ \mathrm{d} x &\geq 2 \int_0^{1/\sqrt{n}} \left( 1-x^2 \right)^n \ \mathrm{d} x \\
&\geq 2 \int_0^{1/\sqrt{n}} \left( 1- n x^2 \right) \ \mathrm{d} x \\
&= \frac{4}{3 \sqrt{n} } \\
&> \frac{1}{ \sqrt{n} },
\end{align}
$$
it follows from (48) that $$ \tag{49} c_n < \sqrt{n}. $$

The inequality $\left( 1-x^2 \right)^n \geq 1-nx^2$ which we used above is easily shown to be true by considering the function
$$ \left( 1- x^2 \right)^n - 1+nx^2 $$
which is zero at $x= 0$ and whose derivative is positive in $(0, 1)$.

For any $\delta > 0$, (49) implies
$$ \tag{50} Q_n(x) \leq \sqrt{n} \left( 1- \delta^2 \right)^n \qquad ( \delta \leq \lvert x \rvert \leq 1), $$
so that $Q_n \to 0$ uniformly in $\delta \leq \lvert x \rvert \leq 1$. [Is this fact really needed in this proof?]

Now set
$$ \tag{51} P_n(x) = \int_{-1}^1 f(x+t) Q_n (t) \ \mathrm{d} t \qquad (0 \leq x \leq 1). $$
Our assumptions about $f$ show, by a simple change of variable, that
$$ P_n(x) = \int_{-x}^{1-x} f(x+t) Q_n(t) \ \mathrm{d} t = \int_0^1 f(t) Q_n(t-x) \ \mathrm{d} t, $$
and the last integral is clearly a polynomial in $x$. [How to demonstrate this explicitly?] Thus $\left\{ P_n \right\}$ is a sequence of polynomials, which are real if $f$ is real.

Given $\varepsilon > 0$, we choose $\delta > 0$ such that $\lvert y-x \rvert < \delta$ implies $$ \lvert f(y) - f(x) \rvert < \frac{\varepsilon}{2}. $$
Let $M = \sup \lvert f(x) \rvert$. Using (48), (50), and the fact that $Q_n(x) \geq 0$, we see that for $0 \leq x \leq 1$,
$$
\begin{align}
\left\lvert P_n(x) - f(x) \right\rvert &= \left\lvert \int_{-1}^1 [ f(x+t) - f(x) ] Q_n(t) \ \mathrm{d} t \right\rvert \\
&\leq \int_{-1}^1 \lvert f(x+t) - f(x) \rvert Q_n(t) \ \mathrm{d} t \\
&\leq 2M \int_{-1}^{-\delta} Q_n(t) \ \mathrm{d} t + \frac{\varepsilon}{2} \int_{-\delta}^\delta Q_n(t) \ \mathrm{d} t + 2 M \int_\delta^1 Q_n(t) \ \mathrm{d} t \\
&\leq 4M \sqrt{n} \left( 1 - \delta^2 \right)^n + \frac{\varepsilon}{2} \\
&< \varepsilon
\end{align}
$$
for all large enough $n$, which proves the theorem. [In hindsight, I think that in the last chain we are better served by a $\delta$ such that $0 < \delta < 1$. Am I right?]

I have here reproduced Rudin's proof and (through my questions and remarks enclosed within pairs of brackets) have asked for clarification of those points in the proof that cause me confusion.

Hope the learned Math Stack Exchange community will come to my rescue!!

Are there any easier or alternative proofs of this theorem? If so, I would greatly appreciate references to them!

Best Answer

Lots of questions here. I think I've had a crack at all of them, but let me know if I've missed something, or if I've made a mistake somewhere. I might be light on detail in a lot of places; apologies in advance.

First: Why is proving the result for functions on $[0,1]$ sufficient?

Suppose we have the result for functions on $[0,1]$, and let $f$ be continuous on $[a,b]$. Define the function $$g(x) = f(a + (b-a)x) \qquad (0 \le x \le 1),$$ which is continuous on $[0,1]$, and apply the result to $g$. This gives us a sequence of polynomials $\{ P_n \}$ converging uniformly to $g$ on $[0,1]$. Given that

$$ f(x) = g\left(\frac{x-a}{b-a}\right) \qquad (a \le x \le b), $$

and each $P_n\left(\frac{x-a}{b-a}\right)$ is also a polynomial in $x$, we have the desired sequence of polynomials approximating $f$ uniformly on $[a,b]$.
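
To spell out why the convergence stays uniform after rescaling, here is a short sketch. The name $\tilde P_n$ is notation introduced here for illustration, not Rudin's. Put $\tilde P_n(x) = P_n\left(\frac{x-a}{b-a}\right)$. Since $s = \frac{x-a}{b-a}$ maps $[a,b]$ one-to-one onto $[0,1]$,

$$ \sup_{a \le x \le b} \left\lvert \tilde P_n(x) - f(x) \right\rvert = \sup_{0 \le s \le 1} \left\lvert P_n(s) - g(s) \right\rvert \to 0 \qquad (n \to \infty). $$

The two suprema range over exactly the same set of values, so the uniform error on $[a,b]$ equals the uniform error on $[0,1]$.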

Second: Why is $f$ uniformly continuous?

$f$ is a continuous function on the compact set $[0,1]$, and is thus uniformly continuous there. Outside of $[0,1]$ it is identically zero, and because $f(0) = f(1) = 0$ the extension has no jump at the endpoints, so there is nothing going on elsewhere on the real line that "breaks" uniform continuity. Showing this rigorously entails a simple application of the definition of uniform continuity once you've established by the argument above that $f$ is uniformly continuous on $[0,1]$.

Alternatively, you can view this as a consequence of the pasting lemma for uniformly continuous functions.
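
For completeness, here is one way the case check can go. This is a sketch in which, for a given $\varepsilon > 0$, $\delta$ denotes the number supplied by uniform continuity of $f$ on $[0,1]$. Take any real $x, y$ with $\lvert x - y \rvert < \delta$. Up to swapping $x$ and $y$, the cases are:

$$
\begin{align}
x, y \in [0,1] &: \quad \lvert f(x) - f(y) \rvert < \varepsilon \ \text{ by the choice of } \delta, \\
x, y \notin [0,1] &: \quad \lvert f(x) - f(y) \rvert = 0, \\
x \in [0,1],\ y > 1 &: \quad \lvert x - 1 \rvert \le \lvert x - y \rvert < \delta, \ \text{ so } \lvert f(x) - f(y) \rvert = \lvert f(x) - f(1) \rvert < \varepsilon, \\
x \in [0,1],\ y < 0 &: \quad \lvert x - 0 \rvert \le \lvert x - y \rvert < \delta, \ \text{ so } \lvert f(x) - f(y) \rvert = \lvert f(x) - f(0) \rvert < \varepsilon.
\end{align}
$$

The last two cases are where $f(0) = f(1) = 0$ is used: it lets you replace $f(y) = 0$ by the value of $f$ at the nearer endpoint.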

Third: Is the fact that $Q_n \to 0$ uniformly on $\delta \le \lvert x \rvert \le 1$ needed in the proof?

You're right, this specific fact seems to play no role in the proof. That is, once we have the bound $Q_n (x) \le \sqrt n \left( 1- \delta^2 \right)^n $ for $\delta \le \lvert x \rvert \le 1$, we have no need for the uniform convergence of $Q_n$. That is unsurprising, since this bound is exactly what implies the uniform convergence.

Of course, one can still prove the theorem (in particular, the second-to-last inequality) by dispensing with the specific bound and using only the fact that $Q_n \to 0$ uniformly, but that seems to require extra notation, if not some work to set it up.
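
Concretely, the bound $(50)$ enters the second-to-last inequality roughly as follows. This is a sketch, assuming $0 < \delta < 1$ so that the tail intervals are nonempty:

$$
\begin{align}
\int_\delta^1 Q_n(t) \ \mathrm{d} t &\leq (1 - \delta) \sqrt{n} \left( 1 - \delta^2 \right)^n \leq \sqrt{n} \left( 1 - \delta^2 \right)^n, \\
\frac{\varepsilon}{2} \int_{-\delta}^{\delta} Q_n(t) \ \mathrm{d} t &\leq \frac{\varepsilon}{2} \int_{-1}^{1} Q_n(t) \ \mathrm{d} t = \frac{\varepsilon}{2}.
\end{align}
$$

The first line also holds on $[-1, -\delta]$ because $Q_n$ is even, so the two tails together contribute at most $2M \cdot 2 \sqrt{n} \left( 1 - \delta^2 \right)^n = 4M \sqrt{n} \left( 1 - \delta^2 \right)^n$. The second line uses $Q_n \geq 0$ and $(48)$.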

Fourth: How do you demonstrate that $ \int_0^1 f(t) Q_n (t -x) \, \mathrm dt $ is a polynomial in $x$?

Observe that

$$ \int_0^1 f(t) Q_n (t -x) \, \mathrm dt = \int_0^1 f(t) c_n \left(1 - (t-x)^2\right)^n \, \mathrm d t. $$

Standard manipulations (like, say, the binomial theorem) allow you to expand the integrand as a polynomial in $x$ whose coefficients are continuous functions of $t$. Integrating term by term removes every $t$, and all you're left with is a polynomial in $x$ with constant coefficients.
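
To make the expansion explicit, here is a sketch. The coefficient functions $a_k$ are notation introduced here for illustration. Expanding $c_n \left( 1 - (t-x)^2 \right)^n$ and collecting powers of $x$ gives polynomials $a_0(t), \ldots, a_{2n}(t)$ in $t$ with real coefficients such that

$$ c_n \left( 1 - (t-x)^2 \right)^n = \sum_{k=0}^{2n} a_k(t) \, x^k, \qquad \text{so} \qquad P_n(x) = \sum_{k=0}^{2n} \left( \int_0^1 f(t) \, a_k(t) \, \mathrm d t \right) x^k. $$

Each integral in parentheses is a constant, since $f a_k$ is continuous on $[0,1]$, and it is real when $f$ is real. For instance, when $n = 1$,

$$ P_1(x) = c_1 \int_0^1 f(t) \left( 1 - t^2 \right) \mathrm d t + 2 c_1 x \int_0^1 t f(t) \, \mathrm d t - c_1 x^2 \int_0^1 f(t) \, \mathrm d t. $$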

Finally: Should $\delta \in (0,1)$?

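I think Rudin snuck this assumption in without making it clear. See, for example, $(50)$, where the range $\delta \le \lvert x \rvert \le 1$ is empty unless $\delta \le 1$. Note also the condition that $\lvert y-x \rvert < \delta$ seems to suggest that $\delta \le 1$, since $x,y \in [0,1]$. In any case, you actually do need $\delta \in (0,1)$ for the last inequality to work for large $n$, and there is no loss in assuming it: if some $\delta$ works in the uniform continuity condition, then so does every smaller one.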
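
For the record, here is one way to see that $4M \sqrt{n} \left( 1 - \delta^2 \right)^n \to 0$ when $0 < \delta < 1$. This is a sketch in which $r$ and $p$ are notation introduced here. Put $r = 1 - \delta^2 \in (0,1)$ and write $r = \frac{1}{1+p}$ with $p > 0$. The binomial theorem gives $(1+p)^n \geq \frac{n(n-1)}{2} p^2$ for $n \geq 2$, hence

$$ 0 \leq \sqrt{n} \, r^n = \frac{\sqrt{n}}{(1+p)^n} \leq \frac{2 \sqrt{n}}{n(n-1) p^2} \to 0 \qquad (n \to \infty). $$

So $4M \sqrt{n} \left( 1 - \delta^2 \right)^n < \frac{\varepsilon}{2}$ for all large enough $n$, which is what the last inequality needs. This is also a special case of Theorem 3.20(d) in the same book.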