Apply the Implicit Function Theorem to find a root of a polynomial

derivatives, implicit-function-theorem, real-analysis

Calculate the value of the real solution of the equation $x^7+0.99x-2.03=0$, and give an estimate for the error.

The hint is: use the Implicit Function Theorem. I don't know how to use the IFT in this case; I'm not familiar with it.

I am thinking of constructing a function $F:\mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ that depends on some parameters and on the unknown root. Maybe
$$F(c_1,c_2,c_3,x) = c_{1}x^7 + c_{2}x - c_{3}.$$
But I'm not sure about this. Can someone help me?

Best Answer

Let $F(x,y,z)=x^7+y\,x-z$; then $F(1,1,2)=0$. We have $$ \frac{\partial F}{\partial x}=7\,x^6+y\implies\frac{\partial F}{\partial x}(1,1,2)\ne0. $$ By the IFT, you can solve for $x$ in the equation $F(x,y,z)=0$ on a neighborhood of $(1,1,2)$. That is, there is a $C^1$ function $\phi(y,z)$ such that $\phi(1,2)=1$ and $F(\phi(y,z),y,z)=0$. What you want now is $\phi(0.99,2.03)$. You cannot obtain an exact formula, but you can find an approximation: $$ \phi(0.99,2.03)\approx\phi(1,2)+\frac{\partial \phi}{\partial y}(1,2)(.99-1)+\frac{\partial \phi}{\partial z}(1,2)(2.03-2). $$ You can find the values of the partial derivatives of $\phi$ from the IFT.
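
To make the last step concrete, here is a minimal sketch of the computation. Differentiating $F(\phi(y,z),y,z)=0$ gives $\frac{\partial\phi}{\partial y}=-\frac{F_y}{F_x}=-\frac{x}{7x^6+y}$ and $\frac{\partial\phi}{\partial z}=-\frac{F_z}{F_x}=\frac{1}{7x^6+y}$, which equal $-\frac18$ and $\frac18$ at $(1,1,2)$, so the first-order estimate is $1+0.00125+0.00375=1.005$. The Python below evaluates this estimate and, as a check that is not part of the answer above, refines it with Newton's method to gauge the error. The function names are illustrative choices.

```python
def F(x, y, z):
    """F(x, y, z) = x^7 + y*x - z, as in the answer."""
    return x**7 + y * x - z


def dF_dx(x, y):
    """Partial derivative of F with respect to x."""
    return 7 * x**6 + y


def linear_estimate(y, z, x0=1.0, y0=1.0, z0=2.0):
    """First-order Taylor estimate of phi(y, z) around the base point (y0, z0)."""
    fx = dF_dx(x0, y0)   # F_x at the base point (equals 8 at (1, 1, 2))
    phi_y = -x0 / fx     # -F_y / F_x, where F_y = x
    phi_z = 1.0 / fx     # -F_z / F_x, where F_z = -1
    return x0 + phi_y * (y - y0) + phi_z * (z - z0)


def newton_root(y, z, x, steps=20):
    """Newton refinement, used here only as a numerical cross-check (an assumption of this sketch)."""
    for _ in range(steps):
        x -= F(x, y, z) / dF_dx(x, y)
    return x


approx = linear_estimate(0.99, 2.03)
root = newton_root(0.99, 2.03, approx)
print("linear estimate:", approx)
print("refined root:   ", root)
print("difference:     ", abs(approx - root))
```

The printed difference gives a numerical sense of the error of the linear approximation; a rigorous bound would instead come from the second-order Taylor remainder of $\phi$.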

Related Solutions

Functional Analysis – Differentiability of Implicit Function in Banach Spaces

Most treatments of the inverse function theorem or the implicit function theorem are based on finding fixed points of a contraction in Banach spaces. Smoothness of the implicit solution $y\mapsto g(y)$ to the equation $F(g(y),y)=0$ follows from the smoothness of the inverse map $A\mapsto A^{-1}$ (defined on the set of invertible bounded operators on a Banach space $X$) and the smoothness of $F$. The main trick is to find good uniform bounds (via the mean value theorem) or to construct uniform contractions. Here is a sketch of how one may proceed:

Definition: Let $U$ and $V$ be open subsets of Banach spaces $X$ and $Y$ respectively. A function $F:\overline{U}\times V\longrightarrow \overline{U}$ is a uniform contraction if there exists $0\leq\theta<1$ such that \begin{align} \|F(x,y)-F(x',y)\|\leq \theta\|x-x'\| \qquad x,\,x'\in \overline{U},\, y\in V.\tag{0}\label{unif_contrac} \end{align}

The following theorem shows that the fixed point of a uniform contraction $F$ is as smooth as the function $F$.

Theorem (Uniform contraction principle): Suppose $W$ is a closed subset of a Banach space $X$ and $V$ is an open subset of a Banach space $Y$. Let $F:W\times V\longrightarrow W$ be a uniform contraction and let $x_*(y)$ be the unique fixed point of $F(\cdot,y):W\longrightarrow W$.

(1) If $F\in\mathcal{C}(W\times V,X)$, then $x_*\in \mathcal{C}(V,X)$.

(2) Suppose $W=\overline{U}$ where $U$ is an open subset of $X$ and that $F(\overline{U}\times V)\subset U$. If $F\in\mathcal{C}(\overline{U}\times V,X)$ and $ F\in\mathcal{C}^r(U\times V,X)$ ($r\geq1$), then $x_*\in \mathcal{C}^r(V,X)$, for each $y\in V$ the linear operator $I-\partial_x F(x_*(y),y)\in \mathcal{L}(X)$ has a bounded inverse, and \begin{align} x_*'(y)=\Big(I-\partial_xF(x_*(y),y)\Big)^{-1}\partial_yF(x_*(y),y),\quad y\in V\tag{1}\label{smooth-fixedpoint} \end{align}
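
As a quick sanity check of \eqref{smooth-fixedpoint}, consider the simplest case, chosen here purely for illustration: $X=Y=\mathbb{R}$, $W=U=V=\mathbb{R}$, and $F(x,y)=\theta x+y$ with a fixed $0\leq\theta<1$. Then $F$ is a uniform contraction with constant $\theta$, and
\begin{align*} x_*(y)=\theta x_*(y)+y \;\Longrightarrow\; x_*(y)=\frac{y}{1-\theta},\qquad x_*'(y)=\frac{1}{1-\theta}=\big(1-\partial_xF\big)^{-1}\partial_yF, \end{align*}
since $\partial_xF=\theta$ and $\partial_yF=1$ in this example.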

A proof of this result is at the end of this post. With the uniform contraction principle at our disposal, we can establish the following result:

Theorem (Implicit function theorem): Let $X$, $Y$ and $Z$ be Banach spaces, $\Omega\subset X\times Y$ open and $F\in \mathcal{C}^r(\Omega,Z)$ for some $r\geq0$. When $r=0$ assume that $\partial_xF\in\mathcal{C}(\Omega)$. If $\partial_xF(x_0,y_0)\in\mathcal{L}(X,Z)$ has a bounded inverse for some $(x_0,y_0)\in\Omega$, then there is an open neighborhood $U\times V\subset\Omega$ of $(x_0,y_0)$ and a unique function $g:V\longrightarrow U$ such that \begin{align} g(y_0)=x_0,\qquad F(g(y),y)=F(x_0,y_0),\qquad y\in V. \end{align} Moreover, $g\in\mathcal{C}^r(V,X)$ and if $r\geq1$, then for every $y\in V$ the linear operator $\partial_xF(g(y),y)\in \mathcal{L}(X,Z)$ has a bounded inverse, and \begin{align} g'(y)=-\big(\partial_xF(g(y),y)\big)^{-1}\partial_yF(g(y),y),\qquad y\in V.\tag{2}\label{imp_f_deriv} \end{align}

Proof of the implicit function theorem: Define $G:\Omega\longrightarrow X$ by \begin{align} G(x,y)=x-\big(\partial_xF(x_0,y_0)\big)^{-1} (F(x,y)-F(x_0,y_0)) \end{align} Observe that $G$ has the same smoothness as $F$; moreover, $x-G(x,y)=0$ iff $F(x,y)=F(x_0,y_0)$. Since $\partial_xG(x_0,y_0)=0$ and $\partial_xG$ is continuous, for any $0<\theta<1$ there exist open balls $U$ and $V_1$ around $x_0$ and $y_0$ respectively, such that $\overline{U}\times \overline{V_1}\subset\Omega$ and $\sup_{(x,y)\in \overline{U}\times V_1}\|\partial_xG(x,y)\|\leq \theta<1$. The mean value theorem implies that \begin{align} \|G(x,y)-G(x',y)\|\leq\theta\|x-x'\|,\qquad x,\, x'\in \overline{U},\quad y\in V_1 \end{align} Let $\delta=\text{rad}(U)$. Since $F$ is continuous on $U\times V_1$ and \begin{align} \|G(x_0,y)-x_0\|\leq\|\big(\partial_xF(x_0,y_0)\big)^{-1}\| \|F(x_0,y)-F(x_0,y_0)\|, \end{align} there is an open ball $V\subset V_1$ around $y_0$ such that $\|G(x_0,y)-x_0\|<(1-\theta)\delta$ for all $y\in V$. Hence, \begin{align} \|G(x,y)-x_0\|\leq \|G(x,y)-G(x_0,y)\|+\|G(x_0,y)-x_0\|<\theta\delta+(1-\theta)\delta=\delta \end{align} for all $x\in \overline{U}$ and $y\in V$. This shows that $G:\overline{U}\times V\longrightarrow U$ is a uniform contraction with $G\in\mathcal{C}^r(U\times V,X)$. By the uniform contraction principle, for each $y\in V$ there is a unique $g(y)\in U$ such that $F(g(y),y)=F(x_0,y_0)$; moreover, $g\in\mathcal{C}^r(V,X)$ and, if $r\geq 1$, \begin{align} g'(y)=\big(I-\partial_xG(g(y),y)\big)^{-1}\partial_yG(g(y),y)= -\big(\partial_xF(g(y),y)\big)^{-1}\partial_yF(g(y),y) \end{align} for all $y\in V\qquad \Box.$
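
As an illustration only, the map $G$ from the proof can be iterated numerically in the scalar case. The sketch below assumes $X=Z=\mathbb{R}$, $Y=\mathbb{R}^2$ and borrows $F(x,(y,z))=x^7+yx-z$ with base point $(x_0,y_0,z_0)=(1,1,2)$ from the question at the top of this page, so that $F(x_0,y_0,z_0)=0$. The names `G` and `fixed_point`, the tolerance and the iteration cap are choices made for this example, not part of the proof.

```python
def F(x, y, z):
    """Scalar example F(x, (y, z)) = x^7 + y*x - z, taken from the question above."""
    return x**7 + y * x - z


def G(x, y, z, x0=1.0, y0=1.0, z0=2.0):
    """G(x, y) = x - (d_x F(x0, y0))^{-1} (F(x, y) - F(x0, y0)), specialised to scalars."""
    dxF0 = 7 * x0**6 + y0  # derivative of F in x, frozen at the base point
    return x - (F(x, y, z) - F(x0, y0, z0)) / dxF0


def fixed_point(y, z, x=1.0, tol=1e-14, max_iter=200):
    """Iterate x <- G(x, y) until successive iterates agree to within tol."""
    for _ in range(max_iter):
        x_next = G(x, y, z)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x


root = fixed_point(0.99, 2.03)
print("fixed point g(y):", root)
print("residual F(g(y), y):", F(root, 0.99, 2.03))
```

Unlike Newton's method, the derivative is evaluated once at the base point and never updated, which is exactly why the iteration is a contraction only for parameters close to $(y_0,z_0)$.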

The inverse function theorem can be obtained as an application of the implicit function theorem.

Theorem (Inverse function theorem): Let $X$, $Y$ be Banach spaces, $W\subset X$ open, and let $f\in\mathcal{C}^r(W,Y)$, $r\geq1$. If $f'(x_0)$ has a bounded inverse for some $x_0\in W$, then there exists an open set $U\subset W$ containing $x_0$ such that $f(U)$ is open, $f:U\longrightarrow f(U)$ is bijective, the inverse function $g=f^{-1}\in \mathcal{C}^r(f(U),X)$, and \begin{align} g'(y)=\big(f'(g(y))\big)^{-1}, \qquad y\in f(U).\tag{3}\label{inv_f_deriv} \end{align}

Proof of the inverse function theorem: Applying the implicit function theorem to $F(x,y)=y-f(x)$ yields neighborhoods $U'\subset W$ and $V\subset Y$ of $x_0$ and $y_0=f(x_0)$ respectively, such that for each $y\in V$ there exists a unique $g(y)\in U'$ satisfying $y=f(g(y))$. Moreover, the map $g:y\mapsto g(y)$ is necessarily in $\mathcal{C}^r(V,X)$. This uniqueness shows that $f$ is injective on $U'\cap f^{-1}(V)$.

The set $U=U'\cap f^{-1}(V)$ is an open neighborhood of $x_0$ with $V=f(U)$, and thus $f:U\longrightarrow V$ is a bijective function whose inverse is $f^{-1}=g$. Finally, the identity \eqref{inv_f_deriv} follows directly from \eqref{imp_f_deriv}. $\Box$
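
For the last step, it may help to write the substitution out. With $F(x,y)=y-f(x)$ we have $\partial_xF(x,y)=-f'(x)$ and $\partial_yF(x,y)=I_Y$ (the identity on $Y$, a symbol introduced here), so \eqref{imp_f_deriv} reads
\begin{align*} g'(y)=-\big(-f'(g(y))\big)^{-1}I_Y=\big(f'(g(y))\big)^{-1},\qquad y\in V. \end{align*}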

For completeness, I add a proof of the uniform contraction principle that I have used in the past. I don't remember whether I proved it myself as an exercise or whether it came from a set of notes from a summer school, so I owe you a source, but I am sure it is common knowledge.

First, here is a useful version of the mean value theorem:

Theorem (Mean value theorem): Suppose $F\in\mathcal{C}^1(U,Y)$ where $U\subset X$ is convex. For any $\boldsymbol{x},\,\boldsymbol{y}\in U$, \begin{align} \|F(\boldsymbol{x})-F(\boldsymbol{y})\|\leq M(\boldsymbol{x},\boldsymbol{y})\,\|\boldsymbol{x}-\boldsymbol{y}\| \end{align} where $M(\boldsymbol{x},\boldsymbol{y})=\sup_{0\leq t\leq 1}\|F'(\boldsymbol{x}+t(\boldsymbol{y-x}))\|$.

Conversely, if there is $M\geq0$ such that \begin{align} \|F(\boldsymbol{x})-F(\boldsymbol{y})\|\leq M\|\boldsymbol{x-y}\|,\qquad \boldsymbol{x},\,\boldsymbol{y}\in U, \end{align} then $\sup_{\boldsymbol{x}\in U}\|F'(\boldsymbol{x})\|\leq M$.

The last part of the mean value theorem will be particularly useful in what follows.

Proof of the uniform contraction principle: (1) Notice that \begin{align} \|x_*(y&+h)-x_*(y)\|=\|F(x_*(y+h),y+h)-F(x_*(y),y)\|\\ &\leq \|F(x_*(y+h),y+h)-F(x_*(y),y+h)\|+\|F(x_*(y),y+h)-F(x_*(y),y)\|\\ &\leq \theta\|x_*(y+h)-x_*(y)\|+\|F(x_*(y),y+h)-F(x_*(y),y)\|. \end{align} Rearranging and using the continuity of $F$ on $W\times V$, we get \begin{align} \|x_*(y+h)-x_*(y)\|\leq \frac{1}{1-\theta}\|F(x_*(y),y+h)-F(x_*(y),y)\|\xrightarrow{h\rightarrow0}0 \end{align} Hence, $x_*\in\mathcal{C}(V,X)$.

(2) The assumption $F(\overline{U}\times V)\subset U$ implies that $x_*$ maps $V$ into $U$ since $x_*(y)=F(x_*(y),y)$. A formal application of the chain rule yields \begin{align} x'_*(y)=\partial_xF(x_*(y),y)x'_*(y)+\partial_yF(x_*(y),y)\tag{4}\label{formal_der} \end{align} at every $y\in V$ where $x_*$ is differentiable. Consider \eqref{formal_der} as a fixed point equation $T(z,y)=z$ where $T:\mathcal{L}(Y,X)\times V\rightarrow \mathcal{L}(Y,X)$ is given by \begin{align} T(z,y)=\partial_xF(x_*(y),y)z+\partial_yF(x_*(y),y)\tag{5}\label{fix_point_eqn} \end{align} The mean value theorem along with \eqref{unif_contrac} implies that \begin{align} \sup_{(x,y)\in U\times V}\|\partial_xF(x,y)\|\leq\theta\tag{6}\label{unif_bnd_der} \end{align} Hence $T$ is a uniform contraction and, by the first part of the proof, $T$ has a continuous fixed point $z:V\rightarrow\mathcal{L}(Y,X)$.
We will now show that $z$ is in fact the derivative of $x_*$. Fix $y\in V$, and set $B(y)=\partial_xF(x_*(y),y)$, $A(y)=\partial_yF(x_*(y),y)$. Let $h(k):=x_*(y+k)-x_*(y)$ for all $k$ small enough. The fixed point properties of $x_*$ and $z$, together with the differentiability of $F$, imply that for all $k$ small enough \begin{align*} (I-B(y))(h(k)-z(y)k)&=F(x_*(y+k),y+k)-F(x_*(y),y)-B(y)h(k)-A(y)k\\ &=F(x_*(y)+h(k),y+k)-F(x_*(y),y)-B(y)h(k)-A(y)k\\ &=:P(h(k),k), \end{align*} where $\frac{\|P(h,k)\|}{\|h\|+\|k\|}\rightarrow0$ as $(h,k)\rightarrow(0,0)$. From \eqref{unif_bnd_der}, we have $\|B(y)\|\leq\theta<1$, so $I-B(y)\in\mathcal{L}(X)$ is an invertible operator with $(I-B(y))^{-1}\in\mathcal{L}(X)$. Since $\|h(k)\|=O(\|k\|)$ by the estimate in part (1), this shows that \begin{align} x_*(y+k)=x_*(y)+z(y)k+r(k) \end{align} where $r(k)=o(\|k\|)$ as $k\rightarrow0$. Hence $x_*$ is differentiable at $y$ with $x_*'(y)=z(y)$, and since $z$ is continuous, $x_*\in\mathcal{C}^1(V,X)$.
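
Two estimates used implicitly in this step can be spelled out; the constant $C$ below is notation introduced for this sketch. Since $\|B(y)\|\leq\theta<1$, the Neumann series converges and
\begin{align*} (I-B(y))^{-1}=\sum_{n=0}^\infty B(y)^n,\qquad \|(I-B(y))^{-1}\|\leq\sum_{n=0}^\infty\theta^n=\frac{1}{1-\theta}. \end{align*}
Moreover, the estimate from part (1) and the differentiability of $F$ at $(x_*(y),y)$ give $\|h(k)\|\leq\frac{1}{1-\theta}\|F(x_*(y),y+k)-F(x_*(y),y)\|\leq C\|k\|$ for all small $k$, so that
\begin{align*} \|h(k)-z(y)k\|\leq\frac{1}{1-\theta}\|P(h(k),k)\|=o\big(\|h(k)\|+\|k\|\big)=o(\|k\|)\qquad\text{as }k\to0. \end{align*}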

For $r>1$, the result follows by induction. Suppose the result holds for $r-1$, so that at least $x_*\in\mathcal{C}^{r-1}(V,X)$. The fact that $x_*$ satisfies \eqref{formal_der} implies that \begin{align} x_*'(y)=\big(I-\partial_xF(x_*(y),y)\big)^{-1}\partial_yF(x_*(y),y) \tag{7}\label{deriv_implicit} \end{align} Since the map $T\mapsto T^{-1}$ from $GL(X)$ to $GL(X)$ is smooth, the right-hand side of \eqref{deriv_implicit} is of class $\mathcal{C}^{r-1}$ whenever $F\in\mathcal{C}^r(U\times V,X)$. Hence $x_*'$ is of class $\mathcal{C}^{r-1}$, that is, $x_*\in\mathcal{C}^r(V,X)$. $\Box$
