Functional Analysis – Differentiability of Implicit Function in Banach Spaces

Tags: analysis, functional-analysis, implicit-function-theorem, nonlinear-analysis

I'm looking at the classical implicit function theorem in Banach spaces. Let $X,Y,Z$ be Banach spaces and let $F: U_{x_0}\times V_{y_0} \to Z$ be continuous and continuously differentiable with respect to $y$. Assume also that the partial Fréchet derivative $F'_y(x_0,y_0)$ has a bounded linear inverse and that $F(x_0,y_0)=O.$
Then locally there is a unique implicit function $T$ with $F(x,Tx)=O,$ and this $T$ is continuous.

Moreover, if we assume an additional smoothness condition on $F,$ then $T$ inherits it.
Of course, formal differentiation of $F(x,Tx) = O$ shows that if $T$ is Fréchet differentiable, then $T'x = - F'_y(x,Tx)^{-1} F'_{x}(x,Tx),$ possibly on a smaller ball.
So if I additionally assume that $F$ is continuously differentiable, I am trying to prove that $T$ is indeed differentiable by showing that $||\omega(x,h)||:=||T(x+h)-Tx+F'_y(x,Tx)^{-1} F'_{x}(x,Tx)h|| = o(||h||)$ as $h \to O,$ that is, by arriving at a bound of the form $||\omega(x,h)||\leqq c \varepsilon ||h||.$

But the bound I'm currently getting is something along these lines (with $\overline{x}:=x+h$, $y:=Tx$, $\overline{y}:=T(x+h)$):

\begin{align*}
||\omega(x,h)||&\leq ||-F'_y (x,y)^{-1}||_{L(Z,Y)}\cdot||-F'_y(x,y)(\overline{y}-y)-F'_x(x,y)h|| \\
&\leq c_1 ||F(x,y)-F(x,\overline{y})+\nu(x,\overline{y}-y)+F(x,y)-F(\overline{x},y)+\mu(h,x)|| \\
&\leq c_1 ( \varepsilon ||\overline{y}-y|| + \varepsilon ||h|| + ||F(x,\overline{y})-F(x,y)+F(\overline{x},y)-F(x,y)||),
\end{align*}

provided that $||h|| < \delta(\varepsilon)$ and $||\overline{y}-y|| < \delta(\varepsilon),$
where $c_1:=||-F'_y (x,y)^{-1}||_{L(Z,Y)}.$

I estimate the last norm in the same way, using the partial Fréchet derivatives:
\begin{align*}
||F(x,\overline{y})-F(x,y)+F(\overline{x},y)-F(x,y)|| \leqq ||F'_y(x,y)||\cdot||\overline{y} - y|| +\varepsilon ||\overline{y}-y|| +||F'_x(x,y)||\cdot||h||+\varepsilon ||h||.
\end{align*}

Denoting $c_2:=||F'_y(x,y)||_{L(Y,Z)}$ and $c_3:=||F'_x(x,y)||_{L(X,Z)},$ this reads
$||F(x,\overline{y})-F(x,y)+F(\overline{x},y)-F(x,y)|| \leqq (c_2 + \varepsilon)||\overline{y}-y|| + (c_3 +\varepsilon)||h||,$ and substituting into the inequality for $||\omega(x,h)||$ we get
$$
|| \omega(x,h)|| \leqq c_1 [(2\varepsilon+c_2)||\overline{y}-y||+(2\varepsilon+c_3)||h||].
$$

A bound for $||\overline{y}-y||$ follows from this inequality combined with the reverse triangle inequality applied to the definition of $\omega$:
$$
| ||\overline{y}-y|| - ||-F'_y(x,y)^{-1}F'_x(x,y)||\cdot||h|| | \leq ||\omega(x,h)||\leqq c_1 [(2\varepsilon+c_2)||\overline{y}-y||+(2\varepsilon+c_3)||h||].
$$

Therefore, with $c_4:=||-F'_y(x,y)^{-1}F'_x(x,y)||\leq c_1 c_3,$ we get $(1-c_1(2\varepsilon+c_2))||\overline{y}-y|| \leq (c_1 c_3 +c_1 (2\varepsilon +c_3))||h||.$
This gives the following absurd (useless) bound:
$$
||\omega(x,h)|| \leqq c_1\left[ (2\varepsilon+c_3) + \frac{2\varepsilon+c_2}{1-c_1(2\varepsilon+c_2)} [c_1 c_3 + c_1(2\varepsilon + c_3)] \right] ||h||.
$$

The problem with this bound is that it would have to be a multiple of $\varepsilon ||h||$ in order to give $o(||h||).$
Any suggestions on how to get a better bound would be much appreciated.

For reference: I was trying to avoid the argument in Deimling's book, where the implicit function theorem is proven (without the assertion about differentiability of the implicit function), then the inverse function theorem is proven, and differentiability of the inverse function is established quite easily.

After that, Deimling proves differentiability of the implicit function using the inverse function theorem.
There he assumes $F$ is $C^m$ and considers the map $G(x,y):=(x, F'_y(x_0,y_0)^{-1}F(x,y)),$ and claims that since
$$
G'(x_0,y_0)(h,k) = (h, k+ F'_y(x_0,y_0)^{-1}F'_x(x_0,y_0)h),
$$

$G'(x_0,y_0)$ must be a homeomorphism (I don't see why this is automatically true).

Then $G^{-1}(x,O)=(x,Tx)$ with $T$ exactly the implicit function we are discussing. Now it's clear how differentiability (and $C^m$ smoothness) in the inverse function theorem carries over to the implicit function theorem.

So an additional question: is it obvious that $G'(x_0,y_0) = (Id_X, Id_Y +F'_y(x_0,y_0)^{-1}F'_x(x_0,y_0))$ is a homeomorphism?

Update: It is obvious, since the second coordinate is a translation by a bounded linear operator, so the inverse is the same map with the opposite sign:
$$G'(x_0,y_0)^{-1}=(Id_X, Id_Y - F'_y(x_0,y_0)^{-1}F'_x(x_0,y_0)).$$
Now it's clear how the rest follows.
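
To spell out the check (a short sketch; the shorthand $A:=F'_y(x_0,y_0)^{-1}F'_x(x_0,y_0)\in L(X,Y)$ and the name $S$ are introduced here only for brevity): with $G'(x_0,y_0)(h,k)=(h,k+Ah)$ and $S(h,k):=(h,k-Ah)$,
$$
G'(x_0,y_0)S(h,k)=(h,(k-Ah)+Ah)=(h,k),\qquad SG'(x_0,y_0)(h,k)=(h,(k+Ah)-Ah)=(h,k).
$$
Both maps are linear and bounded, since for instance $||h||+||k-Ah||\leq (1+||A||)(||h||+||k||),$ so $G'(x_0,y_0)$ is a linear homeomorphism of $X\times Y$ with inverse $S.$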

Best Answer

Most treatments of the inverse function theorem or the implicit function theorem are based on finding fixed points of a contraction in Banach spaces. Smoothness of the implicit solution $y\mapsto g(y)$ to the equation $F(g(y),y)=0$ follows from the smoothness of the inversion map $A\mapsto A^{-1}$ (defined on the set of invertible bounded operators on a Banach space $X$) and the smoothness of $F$. The main trick lies in finding good uniform bounds (via the mean value theorem) or in constructing uniform contractions. Here is a sketch of how one may proceed:

Definition: Let $U$ and $V$ be open subsets of Banach spaces $X$ and $Y$ respectively. A function $F:\overline{U}\times V\longrightarrow \overline{U}$ is a uniform contraction if there exists $0\leq\theta<1$ such that \begin{align} \|F(x,y)-F(x',y)\|\leq \theta\|x-x'\|, \qquad x,\,x'\in \overline{U},\, y\in V.\tag{0}\label{unif_contrac} \end{align}

The following theorem shows that the fixed point of a uniform contraction $F$ is as smooth as the function $F$ itself.

Theorem (Uniform contraction principle): Suppose $W$ and $V$ are closed and open subsets of Banach spaces $X$ and $Y$ respectively. Let $F:W\times V\longrightarrow W$ be a uniform contraction and let $x_*(y)$ be the unique fixed point of $F(\cdot,y):W\longrightarrow W$.

  1. If $F\in\mathcal{C}(W\times V,X)$, then $x_*\in \mathcal{C}(V,X)$.

Suppose $W=\overline{U}$ where $U$ is an open subset of $X$ and that $F(\overline{U}\times V)\subset U$.

  2. If $F\in\mathcal{C}(\overline{U}\times V,X)$ and $ F\in\mathcal{C}^r(U\times V,X)$ ($r\geq1$), then $x_*\in \mathcal{C}^r(V,X)$, for each $y\in V$ the linear operator $I-\partial_x F(x_*(y),y)\in L(X)$ has a bounded inverse, and \begin{align} x_*'(y)=\Big(I-\partial_xF(x_*(y),y)\Big)^{-1}\partial_yF(x_*(y),y),\quad y\in V.\tag{1}\label{smooth-fixedpoint} \end{align}

A proof of this result is at the end of this post. With the uniform contraction principle at our disposal, we can establish the following result:

Theorem (Implicit function theorem): Let $X$, $Y$ and $Z$ be Banach spaces, $\Omega\subset X\times Y$ open and $F\in \mathcal{C}^r(\Omega,Z)$ for some $r\geq0$. When $r=0$ assume that $\partial_xF\in\mathcal{C}(\Omega)$. If $\partial_xF(x_0,y_0)\in\mathcal{L}(X,Z)$ has a bounded inverse for some $(x_0,y_0)\in\Omega$, then there is an open neighborhood $U\times V\subset\Omega$ of $(x_0,y_0)$ and a unique function $g:V\longrightarrow U$ such that \begin{align} g(y_0)=x_0,\qquad F(g(y),y)=F(x_0,y_0). \end{align} Moreover, $g\in\mathcal{C}^r(V,X)$ and if $r\geq1$, then for every $y\in V$ the linear operator $\partial_xF(g(y),y)\in L(X,Z)$ has a bounded inverse, and \begin{align} g'(y)=-\big(\partial_xF(g(y),y)\big)^{-1}\partial_yF(g(y),y),\qquad y\in V.\tag{2}\label{imp_f_deriv} \end{align}
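
A numerical sanity check of \eqref{imp_f_deriv} in the simplest case $X=Y=Z=\mathbb{R}$ may help fix ideas. The sketch below is not part of the proof: the example $F(x,y)=x+\tfrac12\sin x-y$, the base point $(x_0,y_0)=(0,0)$, and all function names are assumptions chosen purely for illustration. It solves $F(x,y)=F(x_0,y_0)$ by iterating $x\mapsto x-\partial_xF(x_0,y_0)^{-1}\big(F(x,y)-F(x_0,y_0)\big)$ and compares a central finite difference of the solution with $-\partial_xF^{-1}\partial_yF$.

```python
import math

# Hypothetical scalar example (an assumption for illustration only):
# F(x, y) = x + 0.5*sin(x) - y with base point (x0, y0) = (0, 0).
def F(x, y):
    return x + 0.5 * math.sin(x) - y

def dF_dx(x, y):
    # Partial derivative of F with respect to x.
    return 1.0 + 0.5 * math.cos(x)

def dF_dy(x, y):
    # Partial derivative of F with respect to y.
    return -1.0

def implicit_solve(y, x0=0.0, y0=0.0, tol=1e-13, max_iter=200):
    """Solve F(x, y) = F(x0, y0) for x by the fixed-point iteration
    x <- x - dF_dx(x0, y0)^{-1} * (F(x, y) - F(x0, y0))."""
    a0_inv = 1.0 / dF_dx(x0, y0)
    target = F(x0, y0)
    x = x0
    for _ in range(max_iter):
        x_new = x - a0_inv * (F(x, y) - target)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")

y = 0.3
x = implicit_solve(y)

# Derivative predicted by the implicit function theorem.
formula = -dF_dy(x, y) / dF_dx(x, y)

# Central finite difference of the computed implicit function.
step = 1e-6
finite_diff = (implicit_solve(y + step) - implicit_solve(y - step)) / (2 * step)

print("g(y)            =", x)
print("formula         =", formula)
print("finite difference =", finite_diff)
```

For this particular example the iteration map has $x$-derivative $(1-\cos x)/3\in[0,2/3]$, so it is a contraction for every $y$; in general one only gets a contraction near $(x_0,y_0)$.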

Proof of the implicit function theorem: Define $G:\Omega\longrightarrow X$ by \begin{align} G(x,y)=x-\big(\partial_xF(x_0,y_0)\big)^{-1} (F(x,y)-F(x_0,y_0)). \end{align} Observe that $G$ has the same smoothness as $F$; moreover, $x-G(x,y)=0$ iff $F(x,y)=F(x_0,y_0)$. Since $\partial_xG(x_0,y_0)=0$ and $\partial_xG$ is continuous, for any $0<\theta<1$ there exist open balls $U$ and $V_1$ around $x_0$ and $y_0$ respectively, such that $\overline{U}\times \overline{V_1}\subset\Omega$ and $\sup_{(x,y)\in \overline{U}\times V_1}\|\partial_xG(x,y)\|\leq \theta<1$. The mean value theorem implies that \begin{align} \|G(x,y)-G(x',y)\|\leq\theta\|x-x'\|,\qquad x,\, x'\in \overline{U},\quad y\in V_1. \end{align} Let $\delta=\text{rad}(U)$. Since $F$ is continuous on $U\times V_1$ and \begin{align} \|G(x_0,y)-x_0\|\leq\|\big(\partial_xF(x_0,y_0)\big)^{-1}\| \|F(x_0,y)-F(x_0,y_0)\|, \end{align} there is an open ball $V\subset V_1$ around $y_0$ such that $\|G(x_0,y)-x_0\|<(1-\theta)\delta$ for all $y\in V$. Hence, \begin{align} \|G(x,y)-x_0\|\leq \|G(x,y)-G(x_0,y)\|+\|G(x_0,y)-x_0\|<\theta\delta+(1-\theta)\delta=\delta \end{align} for all $x\in \overline{U}$ and $y\in V$. This shows that $G:\overline{U}\times V\longrightarrow U$ is a uniform contraction with $G\in\mathcal{C}^r(U\times V,X)$. By the uniform contraction principle, for each $y\in V$ there is a unique $g(y)\in U$ such that $F(g(y),y)=F(x_0,y_0)$; moreover, $g\in\mathcal{C}^r(V,X)$ and, if $r\geq 1$, \begin{align} g'(y)=\big(I-\partial_xG(g(y),y)\big)^{-1}\partial_yG(g(y),y)= -\big(\partial_xF(g(y),y)\big)^{-1}\partial_yF(g(y),y) \end{align} for all $y\in V$. $\Box$
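
The last equality in the proof can be checked directly; the following is a short sketch for $r\geq1$, where the shorthand $A_0:=\partial_xF(x_0,y_0)$ is introduced here only for brevity. Differentiating the definition of $G$ gives \begin{align*} \partial_xG(x,y)=I-A_0^{-1}\partial_xF(x,y),\qquad \partial_yG(x,y)=-A_0^{-1}\partial_yF(x,y), \end{align*} so that $I-\partial_xG(x,y)=A_0^{-1}\partial_xF(x,y)$. Since $\|\partial_xG(x,y)\|\leq\theta<1$ on $U\times V$, the operator $I-\partial_xG(x,y)$ is invertible by the Neumann series, hence so is $\partial_xF(x,y)=A_0\big(I-\partial_xG(x,y)\big)$, and \begin{align*} \big(I-\partial_xG\big)^{-1}\partial_yG=\big(A_0^{-1}\partial_xF\big)^{-1}\big(-A_0^{-1}\partial_yF\big)=-(\partial_xF)^{-1}A_0A_0^{-1}\partial_yF=-(\partial_xF)^{-1}\partial_yF, \end{align*} with all derivatives evaluated at $(g(y),y)$.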

The inverse function theorem can be obtained as an application of the implicit function theorem.

Theorem (Inverse function theorem): Let $X$, $Y$ be Banach spaces, $W\subset X$ open, and let $f\in\mathcal{C}^r(W,Y)$, $r\geq1$. If $f'(x_0)$ has a bounded inverse for some $x_0\in W$, then there exists an open set $U\subset W$ containing $x_0$ such that $f(U)$ is open, $f:U\longrightarrow f(U)$ is bijective, the inverse function $g=f^{-1}\in \mathcal{C}^r(f(U),X)$, and \begin{align} g'(y)=\big(f'(g(y))\big)^{-1}, \qquad y\in f(U).\tag{3}\label{inv_f_deriv} \end{align}

Proof of the inverse function theorem: Applying the implicit function theorem to $F(x,y)=y-f(x)$ yields neighborhoods $U'\subset W$ and $V\subset Y$ around $x_0$ and $y_0=f(x_0)$ respectively, such that for each $y\in V$, there exists a unique $g(y)\in U'$ satisfying $y=f(g(y))$. Moreover, the map $g:y\mapsto g(y)$ is necessarily in $\mathcal{C}^r(V,X)$. This uniqueness shows that $f$ is injective on $U'\cap f^{-1}(V)$.

The set $U=U'\cap f^{-1}(V)$ is an open neighborhood of $x_0$ with $V=f(U)$, and thus $f:U\longrightarrow V$ is a bijective function whose inverse is $f^{-1}=g$. Finally, the identity \eqref{inv_f_deriv} follows directly from \eqref{imp_f_deriv}. $\Box$
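
As a quick check of \eqref{inv_f_deriv} (a sketch under the choice $F(x,y)=y-f(x)$ made in the proof; $I_Y$ denotes the identity on $Y$): here $\partial_xF(x,y)=-f'(x)$ and $\partial_yF(x,y)=I_Y$, so \eqref{imp_f_deriv} gives \begin{align*} g'(y)=-\big(-f'(g(y))\big)^{-1}I_Y=\big(f'(g(y))\big)^{-1},\qquad y\in V. \end{align*}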


For completeness, I add a proof of the uniform contraction principle that I have used in the past. I don't remember whether I proved it myself as an exercise or whether it came from a set of notes from a summer school, so I owe you a source, but I am sure it is common knowledge.

First, here is a useful version of the mean value theorem:

Theorem (Mean value theorem): Suppose $F\in\mathcal{C}^1(U,Y)$ where $U\subset X$ is convex. For any $\boldsymbol{x},\,\boldsymbol{y}\in U$, \begin{align} \|F(\boldsymbol{x})-F(\boldsymbol{y})\|\leq M(\boldsymbol{x},\boldsymbol{y})\,\|\boldsymbol{x}-\boldsymbol{y}\| \end{align} where $M(\boldsymbol{x},\boldsymbol{y})=\sup_{0\leq t\leq 1}\|F'(\boldsymbol{x}+t(\boldsymbol{y-x}))\|$.

Conversely, if there is $M\geq0$ such that \begin{align} \|F(\boldsymbol{x})-F(\boldsymbol{y})\|\leq M\|\boldsymbol{x-y}\|,\qquad \boldsymbol{x},\,\boldsymbol{y}\in U, \end{align} then $\sup_{\boldsymbol{x}\in U}\|F'(\boldsymbol{x})\|\leq M$.
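
The converse part has a short proof, sketched here under the assumption that $U$ is open (so that $\boldsymbol{x}+t\boldsymbol{v}\in U$ for small $t>0$); the direction $\boldsymbol{v}\in X$ is notation introduced for this sketch. For $\boldsymbol{x}\in U$ and $\boldsymbol{v}\in X$, \begin{align*} \|F'(\boldsymbol{x})\boldsymbol{v}\|=\lim_{t\downarrow0}\frac{\|F(\boldsymbol{x}+t\boldsymbol{v})-F(\boldsymbol{x})\|}{t}\leq\lim_{t\downarrow0}\frac{M\|t\boldsymbol{v}\|}{t}=M\|\boldsymbol{v}\|, \end{align*} and taking the supremum over $\|\boldsymbol{v}\|\leq1$ gives $\|F'(\boldsymbol{x})\|\leq M$.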

The last part of the mean value theorem will be particularly useful in what follows.

Proof of the uniform contraction principle: (1) Notice that \begin{align} \|x_*(y+h)-x_*(y)\|&=\|F(x_*(y+h),y+h)-F(x_*(y),y)\|\\ &\leq \|F(x_*(y+h),y+h)-F(x_*(y),y+h)\|+\|F(x_*(y),y+h)-F(x_*(y),y)\|\\ &\leq \theta\|x_*(y+h)-x_*(y)\|+\|F(x_*(y),y+h)-F(x_*(y),y)\|. \end{align} The continuity of $F$ on $W\times V$ implies that \begin{align} \|x_*(y+h)-x_*(y)\|\leq \frac{1}{1-\theta}\|F(x_*(y),y+h)-F(x_*(y),y)\|\xrightarrow{h\rightarrow0}0. \end{align} Hence, $x_*\in\mathcal{C}(V,X)$.

(2) The assumption $F(\overline{U}\times V)\subset U$ implies that $x_*$ maps $V$ into $U$ since $x_*(y)=F(x_*(y),y)$. A formal application of the chain rule yields \begin{align} x'_*(y)=\partial_xF(x_*(y),y)x'_*(y)+\partial_yF(x_*(y),y)\tag{4}\label{formal_der} \end{align} at every $y\in V$ where $x_*$ is differentiable. Consider \eqref{formal_der} as a fixed point equation $T(z,y)=z$ where $T:\mathcal{L}(Y,X)\times V\rightarrow \mathcal{L}(Y,X)$ is given by \begin{align} T(z,y)=\partial_xF(x_*(y),y)z+\partial_yF(x_*(y),y)\tag{5}\label{fix_point_eqn} \end{align} The mean value theorem along with \eqref{unif_contrac} implies that \begin{align} \sup_{(x,y)\in U\times V}\|\partial_xF(x,y)\|\leq\theta\tag{6}\label{unif_bnd_der} \end{align} Hence $T$ is a uniform contraction and, by the first part of the proof, $T$ has a continuous fixed point $z:V\rightarrow\mathcal{L}(Y,X)$.
We will now show that $z$ is in fact the derivative of $x_*$. We fix $y\in V$, and set $B(y)=\partial_xF(x_*(y),y)$, $A(y)=\partial_yF(x_*(y),y)$. Let $h(k):=x_*(y+k)-x_*(y)$ for all $k$ small enough. The fixed point property of $x_*$ and $z$ together with the differentiability of $F$ implies that for all $k$ small enough \begin{align*} (I-B(y))(h(k)-z(y)k)&=F(x_*(y+k),y+k)-F(x_*(y),y)-B(y)h(k)-A(y)k\\ &=F(x_*(y)+h(k),y+k)-F(x_*(y),y)-B(y)h(k)-A(y)k\\ &=:P(h(k),k), \end{align*} where $\frac{\|P(h,k)\|}{\|h\|+\|k\|}\rightarrow0$ as $(h,k)\rightarrow(0,0)$. From \eqref{unif_bnd_der}, we have that $(I-B(y))\in\mathcal{L}(X)$ is an invertible operator with $(I-B(y))^{-1}\in\mathcal{L}(X)$. Combined with the continuity of $x_*$ from part (1), which gives $h(k)\rightarrow0$ as $k\rightarrow0$, a standard absorption argument shows that \begin{align} x_*(y+k)=x_*(y)+z(y)k+r(k) \end{align} where $r(k)=o(\|k\|)$ as $k\rightarrow0$.
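
For completeness, here is one way to fill in this last step (a sketch; the constants $\varepsilon$ and $\rho$ are introduced here). Fix $\varepsilon\in(0,\tfrac{1-\theta}{2}]$. Since $h(k)\rightarrow0$ as $k\rightarrow0$ by part (1), there is $\rho>0$ such that $\|P(h(k),k)\|\leq\varepsilon(\|h(k)\|+\|k\|)$ for $\|k\|<\rho$. By \eqref{unif_bnd_der} and the Neumann series, $\|(I-B(y))^{-1}\|\leq\frac{1}{1-\theta}$, so for $\|k\|<\rho$ \begin{align*} \|h(k)-z(y)k\|&\leq\frac{\varepsilon}{1-\theta}\big(\|h(k)\|+\|k\|\big),\\ \|h(k)\|&\leq\|z(y)\|\|k\|+\tfrac12\big(\|h(k)\|+\|k\|\big)\quad\Longrightarrow\quad\|h(k)\|\leq\big(2\|z(y)\|+1\big)\|k\|,\\ \|r(k)\|=\|h(k)-z(y)k\|&\leq\frac{2\varepsilon}{1-\theta}\big(\|z(y)\|+1\big)\|k\|. \end{align*} Since $\varepsilon$ can be taken arbitrarily small, $r(k)=o(\|k\|)$. This is the kind of bound, a multiple of $\varepsilon\|k\|$, that the question was after: the increment $h(k)$ is first bounded by a constant times $\|k\|$ by absorbing the small term into the left-hand side, and only then is the remainder estimated.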

For $r>1$, the result follows by induction. Suppose the result holds for $r-1$; then at least $x_*\in\mathcal{C}^{r-1}(V,X)$. The fact that $x_*$ satisfies \eqref{formal_der} implies that \begin{align} x_*'(y)=\big(I-\partial_xF(x_*(y),y)\big)^{-1}\partial_yF(x_*(y),y). \tag{7}\label{deriv_implicit} \end{align} Since the inversion map $T\mapsto T^{-1}$ from $GL(X)$ to $GL(X)$ is smooth, the right-hand side of \eqref{deriv_implicit} is of class $\mathcal{C}^{r-1}$, and it follows that $x_*\in\mathcal{C}^r(V,X)$ whenever $F\in\mathcal{C}^r(U\times V,X)$. $\Box$
