Defining differentiability of a function of two variables

Tags: definition, derivatives, multivariable-calculus

We say that the function $f(x, y)$ is differentiable at the point $(a, b)$ if, and only if,

$$\lim_{(h,k)\rightarrow(0,0)} \frac{f(a+h,b+k)-f(a,b)-hf_1(a,b)-kf_2(a,b)}{\sqrt{h^2 + k^2}}=0.$$

If I were to define differentiability of a function of two variables I would say: The function $f(x,y)$ is differentiable at the point $(a, b)$ if, and only if, the partial derivatives $f_1(a,b)$ and $f_2(a,b)$ exist.

Does my definition fail to capture some key property of differentiability in $\mathbb{R}^2$ that the other definition captures? If not, why have mathematicians chosen to define differentiability in terms of the above limit rather than solely in terms of the existence of the partial derivatives of the multivariate function in question?

Best Answer

From single variable calculus, you should know that differentiable functions are more "well behaved" than continuous ones, in the sense that differentiable $\implies$ continuous, but not the other way around. As the other answer shows, the mere existence of partial derivatives doesn't even guarantee that the function is continuous.
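To make this concrete, here is a minimal numeric sketch of the standard counterexample $f(x,y) = \frac{xy}{x^2+y^2}$ (with $f(0,0)=0$): both partial derivatives exist at the origin, yet the function is not even continuous there. (The specific numbers below are just illustrative choices.)

```python
# Classic counterexample: f(x, y) = x*y / (x^2 + y^2), with f(0, 0) = 0.
# Both partial derivatives exist at the origin, yet f is discontinuous there.

def f(x, y):
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x * x + y * y)

# Partial derivatives at the origin via the limit definition: along the
# coordinate axes f is identically zero, so both difference quotients vanish.
h = 1e-8
fx = (f(h, 0) - f(0, 0)) / h  # exactly 0.0
fy = (f(0, h) - f(0, 0)) / h  # exactly 0.0

# But along the diagonal y = x, f(t, t) = 1/2 for every t != 0, so f does not
# approach f(0, 0) = 0 at the origin: it is not continuous there.
along_diagonal = f(1e-8, 1e-8)  # 0.5, no matter how small t gets

print(fx, fy, along_diagonal)
```

So the mere existence of $f_1(0,0)$ and $f_2(0,0)$ tells us nothing about the behaviour of $f$ off the coordinate axes.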

Differential Calculus is (roughly speaking) a theory of "local linear approximations". What I mean by this is that if we are given a function $f$ (for example from $\Bbb{R}^2$ into $\Bbb{R}$), we would like to understand "the behaviour and properties of $f$". Often, $f$ might be non-linear, so it is very hard to analyze directly. What we can do instead is "approximate" it by a linear function $T$. Since linear functions are easier to analyze, we study the properties of $T$, and from these we try to infer properties of the original function $f$.

So far, the paragraph above has been more heuristic/vague rather than technical, but it contains the key idea for all of differential calculus! I'll say it once again just to really emphasise it: the main purpose of differential calculus is to analyze properties of functions by first "approximating" them by linear functions and then analyzing the simpler linear function, and then inferring properties about the original function. So, any reasonable definition of "differentiable" has to capture the idea of "approximation by a linear function". Now, I'll restate the definition of differentiability (with modified notation):

We say that a function $f: \Bbb{R}^2 \to \Bbb{R}$ is differentiable at the point $\alpha$ if there exists a linear transformation $T: \Bbb{R}^2 \to \Bbb{R}$ such that the following equation holds: \begin{align} \lim_{h \to 0} \dfrac{f(\alpha + h) - f(\alpha) - T(h)}{\lVert h \rVert} = 0 \end{align} If such a $T$ exists, then one can prove that it is unique, and we denote it by any of the symbols $Df(\alpha)$, $Df_{\alpha}$, or $df_{\alpha}$ (depending on the author).

If we define the so called "remainder/error" term $R(h) = f(\alpha + h) - f(\alpha) - df_{\alpha}(h)$, then we can rewrite the above limit as \begin{align} f(\alpha + h) - f(\alpha) = df_{\alpha}(h) + R(h) \tag{$*$} \end{align} where $\lim_{h \to 0} \dfrac{R(h)}{\lVert h \rVert} = 0$. Equation $(*)$ is very important to understand conceptually. It says that if $f$ is differentiable at a point $\alpha$ in its domain, then for small displacements $h$, the change in the function $f(\alpha + h) - f(\alpha)$ can be approximated by a linear part $df_{\alpha}(h)$, plus an error term $R(h)$, which is "small" in the sense that $\lim_{h \to 0} \dfrac{R(h)}{\lVert h \rVert} = 0$.
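Equation $(*)$ can be checked numerically. The sketch below uses an illustrative choice of function, $f(x,y) = x^2 + y^2$ at $\alpha = (1, 2)$, where $df_{\alpha}(h) = 2h_1 + 4h_2$; the point is only to watch $R(h)/\lVert h \rVert$ shrink as $h \to 0$.

```python
import math

# Numeric sketch of equation (*) for the illustrative function
# f(x, y) = x^2 + y^2 at alpha = (1, 2), where the linear part is
# df_alpha(h) = 2*1*h1 + 2*2*h2. We verify that R(h)/||h|| -> 0.

def f(x, y):
    return x * x + y * y

a, b = 1.0, 2.0
ratios = []
for t in [1e-1, 1e-2, 1e-3]:
    h1, h2 = t, t                          # displacement h = (t, t), shrinking
    df = 2 * a * h1 + 2 * b * h2           # linear part df_alpha(h)
    R = f(a + h1, b + h2) - f(a, b) - df   # remainder R(h) (here exactly 2*t^2)
    ratios.append(R / math.hypot(h1, h2))  # R(h)/||h||, here sqrt(2)*t
    print(t, ratios[-1])
```

Each tenfold shrink of $h$ shrinks the ratio tenfold, which is exactly the "small remainder" condition in $(*)$.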

This definition captures the heuristic idea of "approximating a non-linear function by a linear function". (From this definition, you can show that differentiability implies continuity). This idea of "linear approximations" has proven to be a very powerful and useful idea in analysis/calculus, and it's why mathematicians have defined differentiability in the manner I have just stated. For example, in single variable calculus, you may have been asked questions like:

  • Finding equations of tangent lines of functions.
  • Estimating a certain number without calculators, for example: find the equation of the tangent line to $f(x) = \sqrt{x}$ at $x=4$. Using this, estimate the value of $\sqrt{3.9}$ without a calculator.

These questions all make use of equation $(*)$ as their guiding principle.
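The $\sqrt{3.9}$ exercise above, worked out as a short sketch: the tangent line to $f(x) = \sqrt{x}$ at $x = 4$ is $L(x) = 2 + (x-4)/4$, which is exactly equation $(*)$ with the remainder $R$ dropped.

```python
import math

# Tangent-line estimate of sqrt(3.9) without a calculator.
# f(x) = sqrt(x), f'(x) = 1/(2*sqrt(x)), so at x = 4:
#   L(x) = f(4) + f'(4) * (x - 4) = 2 + (x - 4)/4

def L(x):
    return 2 + (x - 4) / 4

estimate = L(3.9)        # 2 - 0.1/4 = 1.975
actual = math.sqrt(3.9)  # ~ 1.9748...
print(estimate, actual, abs(estimate - actual))
```

The estimate $1.975$ differs from the true value by less than $0.0002$, precisely because the remainder $R(h)$ is small compared to $h = -0.1$.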


Now, I think it is very unfortunate that students are introduced to the idea of "partial derivatives" so early on, without being introduced to the idea of "local linear approximation" first. Doing this might make you think that partial derivatives are the "fundamental concept", when in fact they are not "fundamental" at all. The fundamental notion is that of "linear approximation" while the concept of partial derivative should only come afterwards. So, the definition you proposed completely fails to capture the idea of "approximating a non-linear function by a linear one", which is why we do not use it.
