How to prove that the second derivative of a function $f:M\to\mathbb{R}$ defined on a surface $M\subset\mathbb{R}^n$ is well defined

QUESTION. How do I prove that the second derivative of a function $f:M\to\mathbb{R}$ defined on a surface $M\subset\mathbb{R}^n$ is well defined?

QUESTION. If, as user Amitai Yuval says, the second derivative of a function defined on a surface $M$ cannot be well defined in general, what is the explanation for this impossibility?

By a second derivative I mean a map $D^{(2)}f$ that associates to each point $ p \in M $ a well-defined bilinear map
$$
D^{(2)}f(p):T_pM \times T_pM \to\mathbb{R}.
$$

Let me give some background in a few steps to make the question more precise.

Step one. Here $ M $ designates a $C^k$ surface of dimension $ m $ within the Euclidean space $ \mathbb{R}^n$, i.e. a set in $\mathbb{R}^n$ which satisfies the following properties:

  1. there is a family of open sets $\{O_i\}_{i\in I}$ in $\mathbb{R}^n$ such that $M\subset \bigcup_{i\in I} O_i$;
  2. for any $U=M\cap O_i$ there are an open set $U_0$ in $\mathbb{R}^m$ and a $C^k$ ($k\geq 2$) parametrization $\varphi:U_0\to U$.

Step two. I am assuming that, for a fixed point $ p \in M $ and any two parametrizations $\varphi:U_0\to U\subset M$ and $\psi:V_0\to V\subset M$ such that $p\in V\cap U$, it has been proved that
$$
(\varphi^{-1}\circ\psi):\psi^{-1}(U\cap V)\to \varphi^{-1}(U\cap V)
$$

and
$$
( \psi^{-1}\circ \varphi):\varphi^{-1}(U\cap V)\to \psi^{-1}(U\cap V)
$$

are $C^k$ ($k\geq 2$) diffeomorphisms.

Step three. I am also assuming that the tangent plane $T_pM$ is the well-defined vector space given, for any parameterization $\varphi:U_0\to U\subset M$ with $\varphi(a)=p$, by
$$
T_pM= D\varphi(a)(\mathbb{R}^m).
$$

By well defined I mean $ D\varphi(a)(\mathbb{R}^m)=D\psi(b)(\mathbb{R}^m)$ for any other parameterization $\psi:V_0\to V\subset M$ such that $\psi(b)=p$.

Step four. A function $ f: M \to \mathbb{R}$ is $C^r$ ($1\leq r<k$) differentiable at the point $p$ if there is a parameterization $\varphi:U_0\to U\subset M$ with $\varphi(a)=p$ such that $f\circ\varphi$ is $C^r$ differentiable at $a$. Since the parameter change $(\varphi^{-1}\circ\psi):\psi^{-1}(U\cap V)\to \varphi^{-1}(U\cap V)$ is $C^k$, it follows that $f\circ\psi=(f\circ\varphi)\circ(\varphi^{-1}\circ\psi)$ is $C^r$ differentiable at $b$ for every parameterization $\psi:V_0\to V\subset M$ such that $\psi(b)=p$.

Step five. If $ f:M\to \mathbb{R} $ is $C^r$ $(r\geq 1)$ differentiable at the point $ p \in M$, then its derivative at that point is the linear transformation $ Df(p):T_pM\to\mathbb{R} $ defined as follows. Take a parameterization $\varphi:U_0\to U\subset M$ with $\varphi(a)=p$. Given a vector $u\in T_p M $ there exists a unique vector $ \mu \in \mathbb{R}^m$ such that $u=D\varphi(a)\mu$. The derivative of $ f $ at the point $ p $ is then simply defined by
$$
Df(p)\cdot u =D (f\circ \varphi)(a)\cdot\mu
$$

Step six. The linear transformation of step five is well defined. That is, if $\psi:V_0\to V$ is any other parameterization with $\psi(b)=p$ and $u=D\psi(b)\zeta$ for some vector $\zeta\in\mathbb{R}^m$, then
$$
D (f\circ \varphi)(a)\cdot\mu= D (f\circ \psi)(b)\cdot\zeta.
$$

Indeed, $\psi=\varphi\circ(\varphi^{-1}\circ\psi)$, where $(\varphi^{-1}\circ\psi):\psi^{-1}(U\cap V)\to \varphi^{-1}(U\cap V)$ is a $C^{k}$ diffeomorphism such that $(\varphi^{-1}\circ\psi)(b)=a$. We have
\begin{align}
D\varphi(a)\cdot \mu=& u \\
=& D\psi(b)\cdot \zeta \\
=& D((\varphi\circ \varphi^{-1})\circ\psi )(b)\cdot \zeta\\
=& D(\varphi\circ(\varphi^{-1}\circ\psi))(b)\cdot \zeta\\
=& D\varphi(a)\cdot \Big(D(\varphi^{-1}\circ\psi)(b)\cdot\zeta\Big)
\end{align}

Since $ D \varphi(a)$ is injective we have $\mu=D(\varphi^{-1}\circ\psi)(b)\cdot\zeta$.
Therefore,
\begin{align}
D(f\circ\psi)(b)\zeta =& D(f\circ(\varphi\circ\varphi^{-1})\circ\psi)(b)\cdot\zeta \\
=& D(f\circ \varphi\circ(\varphi^{-1}\circ\psi))(b)\cdot\zeta \\
=& D(f\circ \varphi)(a) \cdot\Big( D(\varphi^{-1}\circ\psi)(b)\cdot\zeta\Big) \\
=& D(f\circ \varphi)(a) \cdot\mu
\end{align}

Let us end the background here and return to the question. Below is my attempt to answer the question.

By analogy to the first order derivative the second derivative would work as follows.

DEFINITION. Let $M$ be a $C^k$ ($k>2$) surface and let $ f:M\to \mathbb{R} $ be $C^2$ differentiable at the point $ p \in M$. Its second derivative $D^{(2)}f$ associates to each such point $p$ the bilinear transformation
$$
D^{(2)}f(p):T_pM\times T_pM\to\mathbb{R}
$$

such that for every parameterization $\varphi:U_0\to U\subset M$ with $\varphi(a)=p$, and all vectors $u,v\in T_p M $,
$$
D^{(2)}f(p)\cdot (u,v) \mathop{=}^{_{\rm def}}D^{(2)} (f\circ \varphi)(a)\cdot(\mu,\nu)
$$

with $ \mu,\nu \in \mathbb{R}^m$ such that
$$
u=D\varphi(a)\mu \quad \mbox{ and }\quad v=D\varphi(a)\nu.
$$

As in the case of the first derivative, the second derivative will be well defined if, for any other parameterization $\psi:V_0\to V$ with $\psi(b)=p$ and for the vectors $\eta,\zeta\in\mathbb{R}^m$ determined by
$$
D\psi(b)\eta=u \quad \mbox{ and } \quad D\psi(b)\zeta=v,
$$

we have
$$
D^{(2)} (f\circ \varphi)(a)\cdot(\mu,\nu)=D^{(2)} (f\circ \psi)(b)\cdot(\eta,\zeta)
$$

To show the above equality I tried to imitate step six. We have
\begin{align}
D^{(2)} (f\circ \psi)(b)\cdot(\eta,\zeta)
=&
D^{(2)} (f\circ(\varphi\circ\varphi^{-1})\circ \psi)(b)\cdot(\eta,\zeta)
\\
=&
D^{(2)} (f\circ\varphi\circ(\varphi^{-1}\circ \psi))(b)\cdot(\eta,\zeta)
\end{align}

But I cannot see how to use the chain rule on this last expression.
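
For reference, here is a sketch of what the chain rule gives at second order. The names $g=f\circ\varphi$ and $h=\varphi^{-1}\circ\psi$ are introduced only for this illustration, so that $f\circ\psi=g\circ h$ near $b$, with $h(b)=a$ and, by the argument of step six, $Dh(b)\eta=\mu$ and $Dh(b)\zeta=\nu$.
\begin{align}
D^{(2)}(g\circ h)(b)\cdot(\eta,\zeta)
=& D^{(2)}g(a)\cdot\big(Dh(b)\eta,\,Dh(b)\zeta\big)+Dg(a)\cdot\big(D^{(2)}h(b)\cdot(\eta,\zeta)\big)\\
=& D^{(2)}(f\circ\varphi)(a)\cdot(\mu,\nu)+D(f\circ\varphi)(a)\cdot\big(D^{(2)}(\varphi^{-1}\circ\psi)(b)\cdot(\eta,\zeta)\big)
\end{align}
So the two candidate values differ by a term that pairs the first derivative of $f\circ\varphi$ with the second derivative of the change of parameters, and nothing in the hypotheses forces that term to vanish.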

On the other hand, expanding the second derivative $D^{(2)} (f\circ \varphi)(a)\cdot(\mu,\nu)$ in terms of the partial derivatives $\frac{\partial^{2}(f\circ\varphi)}{\partial x_i\partial x_j}(a)$ seemed more productive:
\begin{align}
D^{(2)} (f\circ \varphi)(a)\cdot(\mu,\nu)
=&
\left[
\frac{\partial}{\partial \nu}
\left(
\frac{\partial}{\partial \mu} (f\circ\varphi)
\right)
\right](a)
\\
=&
\left[
\frac{\partial}{\partial \nu}
\left(
\sum_{j=1}^{m}\mu_j\cdot \frac{\partial}{\partial x_j} (f\circ\varphi)
\right)
\right](a)
\\
=&
\sum_{i=1}^m\sum_{j=1}^{m}
\nu_i\cdot\mu_j\cdot \frac{\partial^2}{\partial x_i \partial x_j} (f\circ\varphi)(a)
\end{align}

Since $\frac{\partial}{\partial \mu} (f\circ\varphi)(a)= D(f\circ \varphi)(a) \cdot\mu =Df(p)\cdot u= D(f\circ\psi)(b)\cdot\eta= \frac{\partial}{\partial \eta} (f\circ\psi)(b)$, we have
\begin{align}
D^{(2)} (f\circ \varphi)(a)\cdot(\mu,\nu)
=&
\left[
\frac{\partial}{\partial \nu}
\left(
\frac{\partial}{\partial \mu} (f\circ\varphi)
\right)
\right](a)
\\
=&
\left[
\frac{\partial}{\partial \nu}
\left(
\frac{\partial}{\partial \eta} (f\circ\psi)
\right)
\right](b)
\\
=&
\left[
\frac{\partial}{\partial \nu}
\left(
\sum_{j=1}^{m}\eta_j\cdot \frac{\partial}{\partial x_j} (f\circ\psi)
\right)
\right](b)
\\
=&
\sum_{i=1}^m\sum_{j=1}^{m}
\nu_i\cdot\eta_j\cdot \frac{\partial^2}{\partial x_i \partial x_j} (f\circ\psi)(b)
\\
=&
D^{(2)}(f\circ\psi)(b)\cdot(\eta,\nu)
\end{align}

By an entirely analogous calculation, together with Schwarz's theorem, we have
$$
D^{(2)}(f\circ\varphi)(a)\cdot(\mu,\nu)= D^{(2)}(f\circ\psi)(b)\cdot(\mu,\zeta).
$$

QUESTION. The question then becomes the following. How can I use the identities

\begin{align}
D^{(2)} (f\circ \varphi)(a)\cdot(\mu,\nu)=&D^{(2)}(f\circ\psi)(b)\cdot(\eta,\nu)
\\
D^{(2)}(f\circ\varphi)(a)\cdot(\mu,\nu)=&D^{(2)}(f\circ\psi)(b)\cdot(\mu,\zeta)
\end{align}

to prove that $ D^{(2)}f$ is well defined?

Best Answer

Below is an example showing that the second derivative is not well-defined in the sense you are asking about. The example is followed by a short theoretical discussion.

Example: Let $S^1$ denote the unit circle in $\mathbb{R}^2$. Consider the following two different local parametrizations of $S^1$ around the point $p=\left(\sqrt{\frac{1}{2}},\sqrt{\frac{1}{2}}\right)$:
$$\varphi:(-1,1)\to S^1,\qquad t\mapsto\left(\sqrt{1-t^2},t\right)$$
and
$$\psi:\left(-\frac{\pi}{2\sqrt{2}},\frac{\pi}{2\sqrt{2}}\right)\to S^1,\qquad s\mapsto\left(\cos\left(\sqrt{2}s\right),\sin\left(\sqrt{2}s\right)\right).$$
Consider the tangent vector $v\in T_pS^1$ given by
$$v=-\frac{\partial}{\partial x}+\frac{\partial}{\partial y}=\frac{d\varphi}{dt}\left(\sqrt{\frac{1}{2}}\right)=\frac{d\psi}{ds}\left(\frac{\pi}{4\sqrt{2}}\right).$$
Let $f:S^1\to\mathbb{R}$ be given by $(x,y)\mapsto y$. Then the composition $f\circ\varphi$ is given by $t\mapsto t$, whereas the composition $f\circ\psi$ is given by $s\mapsto\sin \left(\sqrt{2}s\right).$ Let us compute the second derivative of $f$ with respect to both parametrizations:
$$\frac{d^2(f\circ\varphi)}{dt^2}\left(\sqrt{\frac{1}{2}}\right)=0,\qquad \frac{d^2(f\circ\psi)}{ds^2}\left(\frac{\pi}{4\sqrt{2}}\right)=-2\sin\left(\sqrt{2}\frac{\pi}{4\sqrt{2}}\right)=-\sqrt{2}.$$
Hence, the two parametrizations lead to different results, even though both represent the same tangent vector $v$, so the second derivative $d^2f(v,v)$ is not well-defined.
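
As a cross-check, the gap between the two values can be traced to the change of parameter. Here $h=\varphi^{-1}\circ\psi$, $s_0=\frac{\pi}{4\sqrt{2}}$ and $t_0=h(s_0)=\sqrt{\frac{1}{2}}$ are names introduced only for this sketch. Since $\varphi^{-1}(x,y)=y$, we have $h(s)=\sin\left(\sqrt{2}s\right)$, hence $h'(s_0)=1$ and $h''(s_0)=-\sqrt{2}$. Applying the one-variable chain rule twice to $f\circ\psi=(f\circ\varphi)\circ h$ gives
\begin{align}
\frac{d^2(f\circ\psi)}{ds^2}(s_0)
&=\frac{d^2(f\circ\varphi)}{dt^2}(t_0)\cdot h'(s_0)^2+\frac{d(f\circ\varphi)}{dt}(t_0)\cdot h''(s_0)\\
&=0\cdot 1^2+1\cdot\left(-\sqrt{2}\right)=-\sqrt{2}.
\end{align}
The discrepancy is exactly the term $\frac{d(f\circ\varphi)}{dt}(t_0)\cdot h''(s_0)$, which is nonzero here because $f$ has a nonzero first derivative at $p$ and the change of parameter $h$ has a nonzero second derivative at $s_0$.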

Discussion: The second derivative is, in fact, well-defined in the following sense. If $X$ and $Y$ are two vector fields on a smooth manifold $M$, then for a smooth $f:M\to\mathbb{R}$ the second derivative $X(Y(f))$ is a perfectly legit, coordinate free expression. Indeed, the first derivative $Y(f)$ is well-defined at every point and thus defines a new function. Then, the derivative $X(Y(f))$ is well-defined as the (first) derivative of a well-defined function. The question at hand could be posed as follows:

Question: Let $X,X',Y,Y'$ be vector fields on a smooth manifold $M$, and let $p\in M$ such that $X(p)=X'(p)$ and $Y(p)=Y'(p)$. Let $f:M\to\mathbb{R}$ be smooth (or at least twice differentiable). Do we have $$X(Y(f))(p)=X'(Y'(f))(p)?$$

The answer to the question is no, but let us see why. First, note that replacing $X$ by $X'$ does not affect the result. Indeed, the first derivative $Y(f)$ is a well-defined function, and so, the second derivative $X(Y(f))(p)$ only depends on the value of $X$ at $p$. On the other hand, replacing $Y$ by $Y'$ does affect the result. Let us calculate:
\begin{align}X(Y(f))(p)-X(Y'(f))(p)&=X\left(Y(f)-Y'(f)\right)(p)\\&=X((Y-Y')(f))(p).\end{align}
Now, since $Y(p)=Y'(p)$, we have $(Y-Y')(f)(p)=0$. But, in general, we have $(Y-Y')(f)(q)\ne0$ for $q\ne p$. Hence, the derivative $X((Y-Y')(f))(p)$ is just the directional derivative of a function which vanishes at $p$. There is no reason to expect such a derivative to vanish as well.
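
To make the last point concrete, here is a sketch in local coordinates $x_1,\dots,x_m$ around $p$, with $m=\dim M$. The coordinates and the name $Z=Y-Y'$ are assumptions introduced for illustration. Write $Z=\sum_{j=1}^m Z^j\frac{\partial}{\partial x_j}$, where $Z^j(p)=0$ for all $j$ because $Y(p)=Y'(p)$. Then, by the Leibniz rule,
\begin{align}
X(Z(f))(p)&=\sum_{j=1}^m X(Z^j)(p)\,\frac{\partial f}{\partial x_j}(p)+\sum_{j=1}^m Z^j(p)\,X\!\left(\frac{\partial f}{\partial x_j}\right)(p)\\
&=\sum_{j=1}^m X(Z^j)(p)\,\frac{\partial f}{\partial x_j}(p).
\end{align}
The difference therefore involves only the first derivatives of $f$ at $p$, paired with the derivatives of the coefficients of $Y-Y'$ in the direction $X(p)$. In particular, it vanishes for every choice of $Y'$ when $p$ is a critical point of $f$, in which case the second derivative at $p$ does not depend on the extension.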
