Understanding the proof of the implicit function theorem given the inverse function theorem

implicit-function-theorem, multivariable-calculus, real-analysis

I've been spending a reasonable chunk of time trying to understand the proof, but I have some questions that are getting in the way. This is going to be a pretty long post, sorry about that.

First, let me state the theorem as given in my lecture notes.

Let $U$ be an open subset of $\mathbb{R}^{n+m}$ and $f : U \to \mathbb{R}^n$, $n \geq 2, 1 \leq m$ be at least once continuously differentiable. We'll write $f(x) = f(x_1,\dots,x_m,y_1,\dots,y_n) = f(x,y)$. Assume that at a point $(x_0,y_0) \in U$:

  1. $f(x_0,y_0) = 0$

  2. $D_y f(x_0,y_0)$ is invertible.

Then there exist open neighborhoods $W \subset \mathbb{R}^m$ and $W' \subset \mathbb{R}^n$, where $W \times W' \subset U$ and $(x_0,y_0) \in W \times W'$, and exactly one $C^1$ mapping $g : W \to W'$ such that the claims 1 and 2 apply for $f(x, g(x))$.

$\underline{\text{Now for the actual proof:}}$

$\textbf{Section A}$

First we'll define the usual helper function, $F : U \to \mathbb{R}^{n+m}$, $F(x,y) = (x, f(x,y))$. Then $F \in C^1$ and $\det D F(x_0,y_0) = \det D_y f(x_0,y_0) \neq 0$, based on assumption 2.
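
For reference, the determinant identity can be read off the block structure of the Jacobian. This is only a sketch of the standard computation: $D_x f$ (the $n \times m$ matrix of partial derivatives of $f$ with respect to $x$) and $I_m$ (the $m \times m$ identity) are notation assumed here, not taken from the lecture notes.

$$DF(x_0,y_0) = \begin{pmatrix} I_m & 0 \\ D_x f(x_0,y_0) & D_y f(x_0,y_0) \end{pmatrix}, \qquad \det DF(x_0,y_0) = \det I_m \cdot \det D_y f(x_0,y_0) = \det D_y f(x_0,y_0).$$

The matrix is block lower triangular, so its determinant is the product of the determinants of the diagonal blocks.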

By the inverse function theorem, $F$ is a local diffeomorphism and thus has a $C^1$ inverse in a neighborhood $V \subset U$ around the point $(x_0,y_0)$.

Now the function $\bar{F} := F : V \to V' := F(V)$ has the inverse map:

$$\bar{F}^{-1} = (\eta, \xi) : V' \to V.$$

In particular: $$(x,y) = \bar{F}(\bar{F}^{-1}(x,y)) = (\eta(x,y),f(\eta(x),\xi(y))) \quad \forall (x,y) \in V'. $$

Therefore $\eta(x,y) = x$ and $f(x,\xi(x,y)) = y$ when $(x,y) \in V'$.

$\textbf{Section B}$ (The existence of $g$)

Fix an open neighborhood of $(x_0,y_0)$, $V_x \times V_y \subset V$. Then by the continuity of $\bar{F}^{-1}$, the point

$$\bar{F}(x_0,y_0) = (x_0,f(x_0,y_0)) = (x_0, 0) \in V' $$

has an open neighborhood, ${V_x}' \times {V_y}' \subset V'$ such that

$$\bar{F}^{-1}({V_x}' \times {V_y}') = {V_x}' \times \xi({V_x}' \times {V_y}') \subset V_x \times V_y. $$

We'll define $g : {V_x}' \stackrel{\textrm{into}}{\to} V_y, \quad g(x) = \xi(x,0)$

(Note that we'll define $W := {V_x}'$ and $W' := V_y$ for consistency of notation with the theorem.)

Then $g$ fulfills the criteria (1) and (2) for all $x \in W$.

$\textbf{Section C}$ (The uniqueness of the solution)

Assume there exists a $C^1$ function, $h : W \to W'$ such that $f(x,h(x)) = 0$ for all $x \in W$.

Then we have: $$(x,h(x)) = \bar{F}^{-1}(\bar{F}(x,h(x))) = \bar{F}^{-1}(x,f(x,h(x))) = \bar{F}^{-1}(x,0) = \bar{F}^{-1}(x,f(x,g(x))) = \bar{F}^{-1}(\bar{F}(x,g(x))) = (x,g(x)), $$

So $h(x) = g(x)$ for all $x \in W$, which completes the proof.

$\textbf{Discussion of the proof}$

Now, I understand most things up until the use of the inverse function theorem, where we get the local diffeomorphism, and thus a local bijection and inverse.

I don't understand how the functions $\eta$ and $\xi$ are defined, however, what their motivations are. They're written with $\eta(x), \xi(y)$ first, then with both variables. I can't seem to really grasp what their domains and codomains are with the notation, or why we're defining the inverse exactly as two separate functions in the first place.

Section B is just… a mess. Without proper motivation and explanation for $\eta(x,y), \xi(x,y)$, I've little hope of seeing what happens there. A rough overview of the proof or something would be very nice.

Appreciate the help, this thing's pretty complicated, somehow.

Best Answer

> I don't understand how the functions $\eta$ and $\xi$ are defined, however, what their motivations are. They're written with $\eta(x), \xi(y)$ first, then with both variables. I can't seem to really grasp what their domains and codomains are with the notation, or why we're defining the inverse exactly as two separate functions in the first place.

I don't believe it's right to have $\xi(y)$ instead of $\xi(x,y)$, but part of the proof shows that $\eta$ is independent of $y$.

Recall that $F$ is a map from some subset of $\mathbb R^m\times\mathbb R^n$. A local inverse would therefore map into $\mathbb R^m\times\mathbb R^n$, and therefore every point $F^{-1}(x,y)$ (given by some $x\in\mathbb R^m,y\in\mathbb R^n$ in the domain) in the image of the inverse can be written $$F^{-1}(x,y)=(\eta(x,y),\xi(x,y))$$ where $\eta(x,y)\in\mathbb R^m$ and $\xi(x,y)\in\mathbb R^n$. (I'll continue to avoid precisely stating the co/domains and ignore the bars; I think this is a good first approximation.)

So far this was just the Inverse Function Theorem. Now we use the structure of the helper function in the form $F F^{-1}=\operatorname{id}$, i.e. $FF^{-1}(x,y)=(x,y)$, to see that $$(x,y)=F(F^{-1}(x,y)) = F(\eta(x,y),\xi(x,y)) = (A, f(A,B))\Big|_{\substack{A=\eta(x,y) \\ B = \xi(x,y)}} = (\eta(x,y),f(\eta(x,y),\xi(x,y))),$$ i.e. $x = \eta(x,y)$ and $f(x,\xi(x,y)) = y$. Thus the function $\eta$ doesn't depend on $y$, and the function $f(\eta(x,y),\xi(x,y))$ [which is a posteriori just $f(x,\xi(x,y))$] is constant in $x$. More than that, for each $y_0$ in the image, we have that $$f(x,\xi(x,y_0)) = y_0.$$ By translation this $y_0$ is without loss of generality $0$, but it's not at all needed to replace $y_0$ as long as it's a fixed object. So we've found a function $x \mapsto \xi(x,y_0)$ that does the right thing, if you sweep the details about the domain under the rug:

  1. It maps from $x$ in the image of $F$ (which is just the same $x$ in the domain of $f$) to $y$ in the domain of $f$.
  2. It's $C^1$ since it's the second component of the restriction of $F^{-1}\in C^1$ to $y=y_0$.
  3. It solves $f(x,\xi(x,y_0)) = y_0$ for every $x$.
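
To make $\eta$ and $\xi$ concrete, here is a small numerical sketch for the unit circle, $f(x,y) = x^2 + y^2 - 1$ near $(x_0,y_0) = (0,1)$. This example is my own choice and not from the lecture notes; it takes $m = n = 1$ purely for illustration, and the closed-form inverse $\xi(x,y) = \sqrt{1 + y - x^2}$ is only valid on the upper branch where $1 + y - x^2 > 0$.

```python
import math

def f(x, y):
    # Unit-circle constraint (illustrative choice): f(0, 1) = 0, df/dy(0, 1) = 2 != 0.
    return x**2 + y**2 - 1.0

def F(x, y):
    # Helper map F(x, y) = (x, f(x, y)).
    return (x, f(x, y))

def F_inv(x, y):
    # Local inverse near F(0, 1) = (0, 0), written as (eta, xi).
    # Assumes 1 + y - x**2 > 0, i.e. we stay on the upper branch.
    eta = x                          # first component: eta(x, y) = x
    xi = math.sqrt(1.0 + y - x**2)   # second component solves f(x, xi) = y
    return (eta, xi)

def g(x):
    # The implicit function: g(x) = xi(x, 0).
    return F_inv(x, 0.0)[1]

for x in (-0.3, 0.0, 0.2, 0.5):
    for y in (-0.1, 0.0, 0.25):
        a, b = F(*F_inv(x, y))
        # F(F_inv(x, y)) = (x, y): eta returns x, and f(x, xi(x, y)) = y.
        assert abs(a - x) < 1e-12 and abs(b - y) < 1e-12
    # Restricting to y = 0 gives f(x, g(x)) = 0.
    assert abs(f(x, g(x))) < 1e-12
```

Here `F_inv` returns the pair $(\eta, \xi)$ directly, so you can see that the first slot just passes $x$ through while the second slot does all the work of solving $f(x,\cdot) = y$.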