Introduction:
I am having a hard time understanding this proof. A similar scheme of the proof can be found in countless books and lecture notes. However, the sources I have looked at so far do not explain every point of the proof, and the reader is left wondering about many of them. I shall now write out the proof and highlight in $\color{Red}{red}$ the points where I ask "why" or "how", and in $\color{green}{green}$ the points which I believe I have understood, together with my interpretation and explanation of them.
Theorem:
(Requirement 1) :Let $U\subset \mathbb{R^n} $ be open.
(Requirement 2) :Let $f :U\rightarrow \mathbb{R^n}$ be a continuously differentiable function.
(Requirement 3) :Let $Det(D(f(a)))\neq 0$ for some $a \in U$.
Claim : For $b:=f(a)$ there exist open sets $V_a \subset U$ and $V_b \subset \mathbb{R^n}$ with $a\in V_a$, $b\in V_b$ such that the following is true:
(Result 1) :$V_b = f(V_a)$
(Result 2) :$f_{|_{V_a}} : V_a \rightarrow V_b$ is bijective.
(Result 3) :$f^{-1} : V_b \rightarrow V_a$ is continuously differentiable.
(Result 4) :If $f\in C^k(U), k\geq 1$, then the function $f^{-1}:V_b \rightarrow V_a $ satisfies $f^{-1} \in C^k(V_b)$.
Proof:
Let us note at the start of the proof that, writing $L := D(f(a))$, the chain rule gives at the point $a$ that $\color{green}{D(L^{-1} \circ f) = DL^{-1} \circ Df= L^{-1} \circ L = I_n}$ $\color{green}{^{***1}}$, where $I_n$ denotes the identity matrix.
*To $\color{green}{^{***1}}$: The matrix $L^{-1}$ exists because of Requirement 3,
which states that the matrix representing the derivative at $a$ has a nonzero determinant and is therefore invertible.
Its inverse is denoted by $L^{-1}$ (once the theorem is proved, this will turn out to be $D(f^{-1})(b)$).
The derivative of any linear mapping $L$ is $DL= L$ at every point, thus also
$DL^{-1}=L^{-1}$. We thus obtain the given identity.*
$\color{Red}{\text{We can now assume that $D(f(a))= I_n$ without loss of generality; proving the statement for this case is sufficient}}$ $\color{red}{^{***1}}$. We thus have $\partial f_i(a)/\partial x_j = \delta_{ij}$. This suffices because if $L^{-1}\circ f$ is locally invertible, then $f=L\circ(L^{-1} \circ f) $ is also locally invertible, since $L$ is a bijection.
To $\color{red}{^{***1}}$: I do not understand why it is sufficient to prove the statement for this case and ignore the other cases. It seems that some kind of implication is being derived from the relationships shown, but it is not trivial for me to see it!
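A sketch of how this reduction can be made explicit, assuming the theorem is already known for maps whose derivative at $a$ is $I_n$ (the names $\tilde f$ and $\tilde V_b$ are introduced only for this illustration):
\begin{align*}
\tilde f &:= L^{-1}\circ f, & D\tilde f(a) &= L^{-1}\circ Df(a) = I_n,\\
f &= L\circ \tilde f, & V_b &:= L(\tilde V_b),\\
(f_{|V_a})^{-1} &= (\tilde f_{|V_a})^{-1}\circ L^{-1}. &&
\end{align*}
Here $\tilde f_{|V_a}: V_a\rightarrow \tilde V_b$ is the bijection supplied by the special case. Since $L$ is an invertible linear map, it is a homeomorphism, so $V_b$ is open, and $(f_{|V_a})^{-1}$ is the composition of a continuously differentiable map with a linear one, hence continuously differentiable.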
$\color{red}{\text{We now choose a cuboid $K \subset U \subset \mathbb{R^n} $ which fulfills the following conditions:}}$
$1)\ a\in Int(K)\\
2)\ Det(D(f(x)))\neq 0 \quad \forall x\in K\\
3)\ \left| \frac{\partial f_i}{\partial x_j}(x)-\frac{\partial f_i}{\partial x_j}(a)\right|\leq \frac{1}{2n^2} \quad
\forall x\in K \text{ and all } i,j$ $\color{red}{^{***2}}$
To $\color{red}{^{***2}}$: Why does such a cuboid with these specific requirements even exist?
Note: For a continuously differentiable function $f=(f_1,\dots,f_n)$ on a cuboid, if there is a constant $M$ such that
$|\partial f_i/\partial x_j| \leq M$ for all $i,j$ on the cuboid, then $||f(a)-f(b)||\leq n^2M \cdot ||a-b||$ for all points $a,b$ of the cuboid. We will use this relationship to derive the inequality below.
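One way this estimate can be obtained, sketched under the assumption that the segment between $a$ and $b$ lies in the cuboid (true because a cuboid is convex) and that $||\cdot||$ is the Euclidean norm. The intermediate points $\xi_i$ are notation introduced here; they come from the mean value theorem applied to each component $f_i$ along that segment:
\begin{align*}
|f_i(a)-f_i(b)| &= \Big|\sum_{j=1}^n \frac{\partial f_i}{\partial x_j}(\xi_i)\,(a_j-b_j)\Big| \leq M\sum_{j=1}^n |a_j-b_j| \leq nM\,||a-b||,\\
||f(a)-f(b)|| &\leq \sum_{i=1}^n |f_i(a)-f_i(b)| \leq n^2M\,||a-b||.
\end{align*}
With $M = \frac{1}{2n^2}$ this gives the factor $n^2\cdot\frac{1}{2n^2}=\frac12$.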
Let $g(x) := f(x)-x $. Then for all $x \in K$ we have
$|\partial g_i/\partial x_j (x) |= |\partial f_i/\partial x_j (x) - \delta_{ij}| \leq \frac{1}{2n^2}$
Comment: Since we assumed $D(f(a)) = I_n$, we have $\partial f_i/\partial x_j(a)=\delta_{ij}$, so this is exactly property 3) of the cuboid. No magic here.
Applying the "Note" to $g$ with $M=\frac{1}{2n^2}$, we thus obtain:
$ ||g(x_1)-g(x_2) || \leq \frac{1}{2} \cdot || x_1-x_2|| $
By the reverse triangle inequality, $ ||g(x_1)-g(x_2) || = ||(x_2-x_1)-(f(x_2)-f(x_1))|| \geq || x_1-x_2|| - || f(x_1)-f(x_2)|| $. Combining this with the previous estimate we obtain:
$ || x_1 -x_2||-||f(x_1)-f(x_2)|| \leq ||f(x_1)-x_1-f(x_2)+x_2||=||g(x_1)-g(x_2) || \leq \frac{1}{2} \cdot || x_1-x_2|| $
Reorganizing the inequality, we obtain:
$ || x_1-x_2|| \leq 2||f(x_1)-f(x_2)||$ for all $x_1,x_2\in K$. In particular, $f$ is injective on the cuboid $K$.
By $Fr(K)$ we mean the boundary (frontier) of $K$.
$\color{red}{\text{It is in particular true that $a\notin Fr(K)\Rightarrow f(a)\notin f(Fr(K))$}}$ $\color{red}{^{***3}}$
$\color{green}{\text{However, $f(Int(K))$ need not be open. But since $Fr(K)$ is compact, so is $f(Fr(K))$.}}$ $\color{green}{^{***2}}$
$\color{red}{\text{Thus $f(a)$ has a strictly positive distance from this compact set. Let us call it $\delta$, so that $ 0 < \delta := \inf_{x\in Fr(K)}||f(a)-f(x)||$ }}$ $\color{red}{^{***4}}$
To $\color{red}{^{***3}}$: Why is this implication true and where does it come from?
To $\color{green}{^{***2}}$: Because the function is continuous and $Fr(K)$ is the boundary of a cuboid. Cuboids are closed and bounded, hence compact, and $Fr(K)$ is a closed subset of $K$, thus $Fr(K)$ is compact. Continuous functions map compact sets to compact sets, but they do not in general map open sets to open sets.
To $\color{red}{^{***4}}$: Why does this distance need to be strictly greater than zero? How does one derive this implication?
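A possible way to see both points at once, sketched under the assumption that the estimate $||x_1-x_2||\leq 2||f(x_1)-f(x_2)||$ derived above holds on all of $K$. The symbol $d$ is introduced here for the distance from $a$ to the boundary, which is positive because $a\in Int(K)$ and $Fr(K)$ is closed:
\begin{equation*}
||f(a)-f(x)|| \geq \tfrac12\,||a-x|| \geq \tfrac12\, d > 0 \quad \forall x\in Fr(K), \qquad d:=\inf_{x\in Fr(K)}||a-x||.
\end{equation*}
Taking the infimum over $x\in Fr(K)$ gives $\delta\geq \frac12 d>0$, and in particular $f(a)\neq f(x)$ for every $x\in Fr(K)$.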
We now let $V_b:=\{y\in \mathbb{R^n}: ||y-f(a)||< \delta/2\}$.
For $y\in V_b$ and $x\in Fr(K)$ we thus have the inequalities:
$||y-f(a)||< \delta/2$ and $ ||f(a)-f(x)|| \geq \delta$
$\color{green}{\text{Combining these yields the following inequality}}$ $\color{green}{^{***3}}$
$ || y-f(a) || < ||y - f(x) || \quad \forall y \in V_b, x \in Fr(K)$ (Equation i)
To $\color{green}{^{***3}}$: I have verified the correctness of this inequality graphically, but I am not able to derive it rigorously. There must be some trick using inequalities that I am not finding! [Picture]
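For what it is worth, one derivation that seems to work uses only the two inequalities above and the reverse triangle inequality:
\begin{equation*}
||y-f(x)|| \geq ||f(a)-f(x)|| - ||y-f(a)|| > \delta - \frac{\delta}{2} = \frac{\delta}{2} > ||y-f(a)|| \qquad \forall y\in V_b,\ x\in Fr(K).
\end{equation*}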
Now we want to show that $\forall y \in V_b \ \exists! x \in Int(K) : f(x)=y $
For that we consider, for fixed $y\in V_b$, the following function: $h_y: K \rightarrow \mathbb{R}, h_y(x)=||y-f(x)||^2=\sum_{i=1}^n(y_i-f_i(x))^2 $
Now $\color{green}{\text{$h_y$ takes its absolute minimum on the compact set $K$}}$ $\color{green}{^{***4}}$
To $\color{green}{^{***4}}$: Is this due to the extreme value theorem, because $h_y$ is continuous and defined on a compact set?
But Equation i implies that for $x\in Fr(K)$ the following holds:
$h_y(a) < h_y(x)$. Thus the function $h_y$ cannot attain its minimum on $Fr(K)$; rather, the minimum must be attained at a point $x_0 \in Int(K)$. At this point the following holds for every $j$:
$\frac{\partial h_y}{\partial x_j}(x_0) = -2\sum_{i=1}^n(y_i-f_i(x_0))\frac{\partial f_i}{\partial x_j}(x_0) = 0$
$\color{red}{\text{Because $x_0\in K$, we must have $Det\left(\frac{\partial f_i}{\partial x_j}(x_0)\right) \neq 0$}}$ $\color{red}{^{***5}}$
To $\color{red}{^{***5}}$: Is the determinant not defined for matrices? The given partial derivative at a point is not a matrix but rather a vector. Or is what is meant the derivative of $f$ written as the matrix of all partial derivatives at this point, say $Df(x_0)$?
$\color{red}{\text{It follows accordingly that $y_i - f_i(x_0) =0 $ for all $i$, thus $y=f(x_0)$}}$ $\color{red}{^{***6}}$
To $\color{red}{^{***6}}$: But what does the determinant being nonzero have to do with the equation above being equal to zero? I understood that one of the factors must be zero, but why does the determinant being nonzero imply that the other one must be zero? Can we not have it mixed somehow, say for some terms one factor is zero and for other terms the other factor is zero?
This implies that $V_b \subset f(K)$
Now with $V_a := Int (K) \cap f^{-1}(V_b)$ we have that $\color{green}{\text{$f_{|V_{a}}:V_a\rightarrow V_b$ is bijective}}$ $\color{green}{^{***5}}$
To $\color{green}{^{***5}}$: We have shown previously that $ f $ is injective on $K$, and just now that for each $y\in V_b$ there exists exactly one $x_0 \in Int(K)$ with $f(x_0)= y$. Bijectivity follows.
$\color{red}{\text{Both sets $V_a,V_b$ are open. }} $ $\color{red}{^{***7}}$
To $\color{red}{^{***7}}$ Why is this true?
So far Result 1 and Result 2 have been shown; now we show Result 3.
We can now rewrite the inequality from before using $x_i= f^{-1}(y_i) \in V_a \subset K, i=1,2$. We thus obtain:
$||f^{-1}(y_1)-f^{-1}(y_2)|| \leq 2 \cdot || y_1-y_2|| \quad \forall y_1,y_2 \in V_b$
This relationship implies that $(f_{|V_a})^{-1}: V_b \rightarrow V_a $ is continuous (even Lipschitz continuous).
We need only to show differentiability:
Let $x \in V_a$, $y = f(x) \in V_b$ and let $L_x=D(f(x))$. We claim that $f^{-1}$ is differentiable at $y$ and that $Df^{-1}(y)= L^{-1}_x= [D(f(x))]^{-1} $ holds. $\color{green}{\text{Indeed, for $x_1 \in V_a$, $\exists \phi: \mathbb{R^n} \rightarrow \mathbb{R^n}: f(x_1)= f(x)+ L_x(x_1-x) + \phi(x_1-x)$ and $\lim_{x_{1} \rightarrow x} \frac{||\phi(x_1-x)||}{||x_1-x||}= 0$}}$ $\color{green}{^{***5}}$
To $\color{green}{^{***5}}$: Do we obtain this function by rearranging the equation and taking the limit $x_1 \rightarrow x$, because $f$ is differentiable and thus in the limit the left-hand side equals zero? So that such a function exists with the given property?
Now we rearrange the equation and apply $L_x^{-1} $ to both sides (applying the same map to both sides preserves the equality). With the notation $f(x) = y , f(x_1)=y_1$ we obtain the following:
$L_x^{-1}(y_1-y)= f^{-1}(y_1)-f^{-1}(y)+L_x^{-1}(\phi(f^{-1}(y_1)-f^{-1}(y)))$
$\color{red}{\text{It is thus sufficient for the differentiability to show that: $\lim_{y_{1}\rightarrow y} \frac{||L_x^{-1}(\phi(f^{-1}(y_1)-f^{-1}(y)))||}{||y_1-y||} = 0$}}$ $\color{red}{^{***8}}$
To $\color{red}{^{***8}}$: Why is this sufficient?
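A sketch of why this would suffice, assuming the usual definition of differentiability (a linear map plus a remainder that is small relative to $||y_1-y||$). The name $R$ for the remainder is introduced here only for illustration. Solving the rearranged equation for $f^{-1}(y_1)$ gives
\begin{align*}
f^{-1}(y_1) &= f^{-1}(y) + L_x^{-1}(y_1-y) + R(y_1), \\
R(y_1) &:= -L_x^{-1}\big(\phi(f^{-1}(y_1)-f^{-1}(y))\big).
\end{align*}
So if $||R(y_1)||/||y_1-y||\rightarrow 0$ as $y_1\rightarrow y$, then by definition $f^{-1}$ is differentiable at $y$ with derivative $L_x^{-1}$.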
We proceed. We know that for linear maps the following estimate holds:
$||L_x^{-1}(\phi(f^{-1}(y_1)-f^{-1}(y)))|| \leq A \cdot ||\phi(f^{-1}(y_1)-f^{-1}(y))||$
for some constant $A$. $\color{red}{\text{Now due to the continuity of $f^{-1}$}}$ $\color{red}{^{***9}}$
$\lim_{y_1\rightarrow y} \frac{||\phi(f^{-1}(y_1)-f^{-1}(y))||}{||y_1-y||} = \lim_{x_1\rightarrow x} \frac{||\phi(x_1-x)||}{||f(x_1)-f(x)||} = \lim_{x_1\rightarrow x} \frac{||\phi(x_1-x)||}{||x_1-x||}\cdot \frac{||x_1-x||}{||f(x_1)-f(x)||} = 0$, because the first factor tends to $0$ and the second factor is at most $2$ (see above).
Thus we conclude that $f^{-1}$ is differentiable and $\color{red}{\text{$Df^{-1}(f(x))=[D(f(x))]^{-1}$}}$ $\color{red}{^{***9}}$
To $\color{red}{^{***9}}$: At this point, I am not sure whether I am just tired or whether I really cannot see why that holds!
Now let $f= (f_1,\dots,f_n): V_a \rightarrow V_b$ with the inverse function $f^{-1}=(u_1,\dots,u_n):V_b \rightarrow V_a$. Thus $f_i(u_1(y),\dots,u_n(y)) = y_i$. We differentiate with respect to $y_j$ and obtain:
$\sum_{k=1}^n \frac{\partial f_i}{\partial x_k} (f^{-1}(y))\frac{\partial u_k}{\partial y_j}(y) = \delta_{ij}$. Thus the following must hold: $ \left[\frac{\partial u_k}{\partial y_j}(y)\right] =\left[\frac{\partial f_i}{\partial x_k} (f^{-1}(y))\right]^{-1}$
$\color{red}{\text{Thus every $\frac{\partial u_k}{\partial y_j}(y)$ is a rational function $q_{kj}$ of the $\frac{\partial f_i}{\partial x_k} (f^{-1}(y))$. If now $f \in C^k$, it follows that $f^{-1} \in C^k$, because $f^{-1}$ is continuous.}}$
$\color{red}{^{***10}}$
To $\color{red}{^{***10}}$: I have absolutely no understanding of this colored line. I do not see why it is a rational function, where the continuity plays a role, and why $f^{-1}$ is also of the same class.
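A sketch of where the rational function could come from, assuming the standard adjugate (Cramer's rule) formula for the inverse of a matrix. The abbreviation $J(y)$ is introduced here only for illustration:
\begin{align*}
J(y) &:= \left[\frac{\partial f_i}{\partial x_k}(f^{-1}(y))\right], \\
\left[\frac{\partial u_k}{\partial y_j}(y)\right] &= J(y)^{-1} = \frac{1}{\det J(y)}\,\operatorname{adj} J(y).
\end{align*}
Every entry of $\operatorname{adj} J(y)$, and also $\det J(y)$, is a polynomial in the entries of $J(y)$, and $\det J(y)\neq 0$ on $V_b$ because $f^{-1}(y)\in V_a\subset K$. So each $\frac{\partial u_k}{\partial y_j}$ is a quotient of polynomials in the functions $\frac{\partial f_i}{\partial x_k}\circ f^{-1}$. If $f\in C^1$, these compositions are continuous because $f^{-1}$ is continuous, which gives $f^{-1}\in C^1$. For higher orders one would presumably repeat the argument inductively, each step gaining one order of differentiability for $f^{-1}$.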
q.e.d
In advance, I thank the brave souls who will take on this question. I realize it is very lengthy and probably hard to grasp, and that one needs to take multiple looks, so I am thankful for the engagement and for any help in filling the gaps. Please do not feel obligated to answer every point. Even if you answer some points and someone else answers the others, I would be very happy.
Best Answer
\begin{equation*} (Df(x_0))(a)=0 \end{equation*} Since $Df(x_0)$ is invertible, it follows that $a$ is the zero vector.
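To connect this with the equation in the question, here is a sketch under the assumption that the vector in question is $v := y-f(x_0)$. The name $v$ is introduced here; the vector called $a$ above is presumably this one, and not the point $a$ of the theorem. The $n$ equations $\frac{\partial h_y}{\partial x_j}(x_0)=0$, $j=1,\dots,n$, read in matrix form
\begin{equation*}
\sum_{i=1}^n \frac{\partial f_i}{\partial x_j}(x_0)\,v_i = 0 \quad (j=1,\dots,n) \qquad\Longleftrightarrow\qquad (Df(x_0))^{T}\,v = 0 .
\end{equation*}
The transpose appears because the sum runs over the row index $i$ of the Jacobian. Since $\det (Df(x_0))^{T}=\det Df(x_0)\neq 0$, the only solution is $v=0$, that is $y=f(x_0)$. The terms cannot be "mixed", because all $n$ equations share the same vector $v$.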