A few $(3)$ questions regarding Spivak’s proof of the Inverse Function Theorem.

I have a few questions regarding Spivak's proof of the Inverse Function Theorem:

Theorem: Let $f$ be a function $\mathbb{R}^n\to\mathbb{R}^n$. If

$\ \ \ \ a)$ $f$ is $C^1$ in an open set containing $a\in\mathbb{R}^n$, and

$\ \ \ \ b)$ $f'(a)$ is invertible i.e. $\det f'(a) \ne 0$,

then there is an open set $V$ containing $a$ and an open set $W$ containing $f(a)$ such that

1. $f:V\to W$ is bijective.
2. $f^{-1}:W\to V$ is $C^1$.
3. The equation
   $$(f^{-1})'(y) = \left[f'(x)\right]^{-1}$$
   holds for any $y\in W$ and $x := f^{-1}(y)$.
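
As a one-variable sanity check of the derivative formula (an illustrative example chosen here, not taken from Spivak's text), take $n=1$ and $f(x)=e^x$. Then $f'(x)=e^x\neq 0$ everywhere and $f^{-1}(y)=\log y$ on $(0,\infty)$, so with $x=\log y$:
$$(f^{-1})'(y)=\frac{1}{y}=\frac{1}{e^{\log y}}=\frac{1}{f'\big(f^{-1}(y)\big)}=\left[f'(x)\right]^{-1}.$$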

Proof: Letting $\lambda = f'(a)$, we have that
$$(\lambda^{-1}\circ f)'(a) = (\lambda^{-1})'(f(a))\circ \lambda = \lambda^{-1}\circ \lambda = \text{Id}.$$
Hence, if the theorem holds for $\lambda^{-1}\circ f$, then it clearly holds for $f$. Therefore we may assume at the outset that $\lambda = f'(a)$ is the identity. Whenever $f(a + h) = f(a)$ with $h\ne 0$ we have
$$\frac{|f(a+h)-f(a)-\lambda(h)|}{|h|} = \frac{|h|}{|h|} = 1,$$
whereas
$$\lim_{h\to 0}\frac{|f(a+h)-f(a)-\lambda(h)|}{|h|}=0.$$
This means we cannot have $f(x)=f(a)$ for $x$ arbitrarily close to, but unequal to, $a$. Therefore there is a closed rectangle $U$ containing $a$ in its interior such that

$(1)$ $f(x)\ne f(a)$ for all $x\in U$ different from $a$.
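
Returning briefly to the reduction at the start of the proof, here is a sketch of why the theorem for $\lambda^{-1}\circ f$ gives the theorem for $f$. The names $\tilde f$ and $\tilde W$ are introduced only for this illustration. Suppose $\tilde f := \lambda^{-1}\circ f$ maps $V$ bijectively onto an open set $\tilde W$ with $C^1$ inverse. Since $\lambda$ is a linear isomorphism, $W := \lambda(\tilde W)$ is open, $f = \lambda\circ\tilde f$ maps $V$ bijectively onto $W$, and $f^{-1} = \tilde f^{-1}\circ\lambda^{-1}$ is $C^1$. By the chain rule, with $x = f^{-1}(y)$:
$$(f^{-1})'(y) = (\tilde f^{-1})'(\lambda^{-1}y)\circ\lambda^{-1}
= \left[\lambda^{-1}\circ f'(x)\right]^{-1}\circ\lambda^{-1}
= \left[f'(x)\right]^{-1}\circ\lambda\circ\lambda^{-1}
= \left[f'(x)\right]^{-1}.$$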

Since $f$ is $C^1$ in an open set containing $a$, we can also assume that

$(2)$ $\det f'(x) \ne 0$ for all $x\in U$.

$(3)$ $|\partial_jf_i(x) - \partial_jf_i(a)|<1/(2n^2)$ for all $i,j$, and $x\in U$.

Observe that
$$|x_1-x_2|-|f(x_1)-f(x_2)|
\le |f(x_1)-x_1 - [f(x_2)-x_2]|
\le \frac{1}{2}|x_1-x_2|$$
where the first inequality comes from the reverse triangle inequality, while the second follows from $(3)$ and a lemma (Spivak's Lemma 2-10) applied to the map $x\mapsto f(x)-x$. We obtain

$(4)$ $|x_1-x_2|\le 2|f(x_1)-f(x_2)|$ for $x_1,x_2\in U$.
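
To spell out the second inequality, here is a sketch assuming the lemma in question is the one stating that if $|\partial_j h_i|\le M$ on a rectangle, then $|h(x_1)-h(x_2)|\le n^2M|x_1-x_2|$ there. The name $h$ is used only for this illustration. Take $h(x) = f(x)-x$. Since $f'(a)=\text{Id}$, we have $\partial_jf_i(a)=\delta_{ij}$, so the bound on the partial derivatives gives $M = 1/(2n^2)$:
$$\partial_jh_i(x) = \partial_jf_i(x)-\delta_{ij} = \partial_jf_i(x)-\partial_jf_i(a),
\qquad
|h(x_1)-h(x_2)| \le n^2\cdot\frac{1}{2n^2}\,|x_1-x_2| = \frac12|x_1-x_2|.$$
Rearranging $|x_1-x_2|-|f(x_1)-f(x_2)|\le\frac12|x_1-x_2|$ then yields
$$\frac12|x_1-x_2|\le|f(x_1)-f(x_2)|.$$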

Now $f(\partial U)$ is a compact set which, by $(1)$, does not contain $f(a)$. Therefore there is a number $d>0$ such that $|f(a)-f(x)|\ge d$ for $x\in \partial U$. Let
$$W:= \{ y:|y-f(a)|<d/2 \} .$$
We have

$(5)$ $|y-f(a)| < |y-f(x)|$ for any $y\in W$ and $x\in \partial U$.
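
This inequality can be checked directly from the choice of $d$ and $W$. For $y\in W$ and $x\in\partial U$ we have $|f(a)-f(x)|\ge d$ and $|y-f(a)|<d/2$, so the reverse triangle inequality gives
$$|y-f(x)| \ge |f(a)-f(x)| - |y-f(a)| > d-\frac d2 = \frac d2 > |y-f(a)|.$$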

We will show that for any $y\in W$ there is a unique $x\in \text{Int}(U)$ such that $f(x)=y$. To prove this consider the function $g:U\to \mathbb{R}$ defined by
$$g:x\mapsto |y-f(x)|^2 = \sum_{i=1}^n\big(y_i-f_i(x)\big)^2.$$
This function is continuous and therefore has a minimum on $U$. If $x\in \partial U$, then, by $(5)$, we have $g(a) < g(x)$. Therefore the minimum of $g$ does not occur on the boundary of $U$. There is a point $x\in \text{Int}(U)$ such that $\partial_jg(x) = 0$ for all $j$, that is
$$\sum_{i=1}^n 2\big(y_i-f_i(x)\big)\partial_jf_i(x)=0 \ \ \ \ \forall j.$$
By $(2)$ the matrix $(\partial_jf_i(x))$ has non-zero determinant. Therefore we must have $y_i - f_i(x) = 0$ for all $i$, that is $y = f(x)$.

Question 1: why do the equation above and $(2)$ imply $y_i - f_i(x) = 0$ for all $i$?

This proves the existence of $x$. Uniqueness follows immediately from $(4)$.
If $V = (\text{Int}(U)) \cap f^{-1}(W)$, we have shown that the function $f:V\to W$ has an inverse $f^{-1}: W \to V$. We can rewrite $(4)$ as

$(6)$ $|f^{-1}(y_1)-f^{-1}(y_2)|\le 2|y_1-y_2|$ for $y_1,y_2\in W$.

This shows that $f^{-1}$ is continuous. Only the proof that $f^{-1}$ is differentiable remains. Let $\mu = f'(x)$. We will show that $f^{-1}$ is differentiable at $y$ with derivative $\mu^{-1}$. For $x_1\in V$, we have
$$f(x_1) = f(x) + \mu(x_1-x) + \phi(x_1-x)$$
where
$$\lim_{x_1\to x} \frac{|\phi(x_1-x)|}{|x_1-x|}=0.$$
Therefore
$$\mu^{-1}\big(f(x_1)-f(x)\big) = x_1 - x + \mu^{-1}\big(\phi(x_1-x)\big).$$
Since every $y_1\in W$ is of the form $f(x_1)$ for some $x_1\in V$, this can be written as
$$f^{-1}(y_1) = f^{-1}(y) + \mu^{-1}(y_1-y) - \mu^{-1}\big(\phi(f^{-1}(y_1)-f^{-1}(y))\big),$$
and it therefore suffices to show that
$$\lim_{y_1\to y}\frac{\left|\mu^{-1}\big(\phi(f^{-1}(y_1)-f^{-1}(y))\big)\right|}{|y_1-y|}=0.$$
Therefore it suffices to show that
$$\lim_{y_1\to y}\frac{\left|\phi(f^{-1}(y_1)-f^{-1}(y))\right|}{|y_1-y|}=0.$$

Question 2: the trick here, I believe, requires us to divide (and multiply) by $\phi(f^{-1}(y_1)-f^{-1}(y))$. How do we know said quantity is non-zero?

Now
$$\frac{\left|\phi(f^{-1}(y_1)-f^{-1}(y))\right|}{|y_1-y|}
= \frac{\left|\phi(f^{-1}(y_1)-f^{-1}(y))\right|}{|f^{-1}(y_1)-f^{-1}(y)|}
\frac{|f^{-1}(y_1)-f^{-1}(y)|}{|y_1-y|}.$$
Since $f^{-1}$ is continuous we have that $f^{-1}(y_1)\to f^{-1}(y)$ as $y_1\to y$. Therefore the first factor approaches $0$. Since, by $(6)$, the second factor is at most $2$, the product also approaches $0$.

Question 3: in some versions of the theorem, the function $f^{-1}$ is said to be $C^1$, yet Spivak neither states nor proves this fact. How can it be proven?

Best Answer

Regarding Question 1: Let $A$ be the $n\times n$ matrix $[\partial_jf_i(x)]$ (or maybe its transpose; I haven't checked carefully, but it doesn't matter), and let $\xi:=y-f(x)\in\Bbb{R}^n$. The equation at hand is $A\cdot\xi=0$ (or maybe $A^t\cdot \xi=0$). Since $A$ has non-zero determinant, it is invertible (which is equivalent to $A^t$ being invertible, hence my being so cavalier above regarding transposes), and thus the equation implies $\xi=0$, i.e. $y-f(x)=0$.

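
To pin down the transpose, here is the index computation, under the convention (an assumption about indexing, chosen here) that $i$ is the row index and $j$ the column index, so $A_{ij}=\partial_jf_i(x)$ and $\xi_i = y_i-f_i(x)$. The critical-point equation from the proof reads
$$\sum_{i=1}^n 2\,\xi_i A_{ij} = 0\ \ \forall j
\iff (A^t\xi)_j = 0\ \ \forall j
\iff A^t\xi = 0.$$
Since $\det A^t=\det A\neq 0$, multiplying by $(A^t)^{-1}$ gives $\xi = (A^t)^{-1}0 = 0$.
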
Regarding Question 2: No, that’s not the trick. The point is that $\mu=Df_x:\Bbb{R}^n\to\Bbb{R}^n$ is a linear transformation, and therefore continuous (Spivak’s Problem 1-10). Actually, it is a general fact that every linear transformation between finite-dimensional normed vector spaces is continuous, but Spivak is only working on $\Bbb{R}^n$, so he only needs the much simpler statement which he left as an exercise in 1-10. So $\mu^{-1}$ is a continuous linear transformation as well. Now
$$\lim_{y_1\to y}\frac{\left|\mu^{-1}\big(\phi(f^{-1}(y_1)-f^{-1}(y))\big)\right|}{|y_1-y|}=\lim_{y_1\to y}\left|\mu^{-1}\left(\frac{\phi(f^{-1}(y_1)-f^{-1}(y))}{|y_1-y|}\right)\right|,$$
where the equality uses linearity of $\mu^{-1}$ (homogeneity, to be specific). So, rather than showing directly that this limit is $0$, it suffices (by continuity of $\mu^{-1}$) to show that the quantity inside $\mu^{-1}$ approaches $0$. Then he divides by $|f^{-1}(y_1)-f^{-1}(y)|$ (when $y_1\neq y$, of course), and this is justified by injectivity of $f^{-1}$.

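
The whole estimate can be sketched in one chain. The symbols $M$ and $u$ are introduced only for this illustration: $M$ is a constant with $|\mu^{-1}(v)|\le M|v|$ for all $v$ (this is what Problem 1-10 provides), and $u := f^{-1}(y_1)-f^{-1}(y)$, which is non-zero for $y_1\ne y$ because $f^{-1}$ is injective. Then
$$\frac{\left|\mu^{-1}\big(\phi(u)\big)\right|}{|y_1-y|}
\le M\,\frac{|\phi(u)|}{|u|}\cdot\frac{|u|}{|y_1-y|}
\le 2M\,\frac{|\phi(u)|}{|u|}\longrightarrow 0\quad\text{as } y_1\to y,$$
where the second inequality uses $|f^{-1}(y_1)-f^{-1}(y)|\le 2|y_1-y|$, and the limit uses $u\to 0$ (continuity of $f^{-1}$) together with $|\phi(u)|/|u|\to 0$ as $u\to 0$.
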
Regarding Question 3: In the addendum at the end of the book, Spivak mentions that he should have added a remark after the theorem: the given formula for the derivative of $f^{-1}$, namely $D(f^{-1})_y=[Df_{f^{-1}(y)}]^{-1}$, shows that $D(f^{-1})$ is the composition $\text{inv}\circ Df\circ f^{-1}$, where $\text{inv}: \text{GL}(\Bbb{R}^n)\to \text{GL}(\Bbb{R}^n)$ is the inversion map. This is a composition of continuous maps and hence continuous. Therefore, $f^{-1}$ is $C^1$. In fact, you can generalize this using induction to see that if $f$ is of class $C^k$ (where $1\leq k\leq \infty$) then so is $f^{-1}$.

Note that Spivak justifies continuity of the inversion map $\text{inv}$ using Cramer’s explicit formula for the matrix inverse. If you look at Cramer’s formula, you’ll see that matrix inversion is in fact an analytic map; this is not just a special feature of inversion in $\Bbb{R}^n$, but a general feature in Banach algebras (the proof of analyticity essentially mimics the usual power series proof for $\frac{1}{1-z}=\sum_{n=0}^{\infty}z^n$).

The reason I’m mentioning all this is that you shouldn’t be so worried about why Spivak isn’t giving you “all” the information: Calculus on Manifolds is his first and most concise and terse text (some people hate it, while others love it for this very reason). The theorems’ conclusions can be strengthened even if he doesn’t say so. In fact, there is an analytic version of the inverse function theorem as well, even in the infinite-dimensional Banach space case. The proof of course has to be modified: this ‘simple’ compactness argument no longer works, since the closed unit ball is never compact in an infinite-dimensional Banach space; instead one starts off by approximating solutions to the equation $y=f(x)$, and then uses Banach’s contraction mapping fixed point lemma to prove the necessary existence of solutions.
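
For concreteness, here is what the two descriptions of inversion look like; the $2\times 2$ case and the symbol $H$ are illustrative choices made here, not taken from Spivak. Cramer's formula writes each entry of $A^{-1}$ as a polynomial in the entries of $A$ divided by $\det A$, which is non-zero on $\text{GL}(\Bbb{R}^n)$:
$$A^{-1}=\frac{1}{\det A}\,\text{adj}(A),
\qquad
\begin{pmatrix} a & b\\ c & d\end{pmatrix}^{-1}
=\frac{1}{ad-bc}\begin{pmatrix} d & -b\\ -c & a\end{pmatrix}.$$
The power series argument is the matrix analogue of the geometric series: for $\|H\|<1$ in a submultiplicative norm,
$$(I-H)^{-1}=\sum_{k=0}^{\infty}H^k.$$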
