[Math] How is the Jacobian the derivative if they have different dimensions

derivatives, matrix-calculus, multivariable-calculus

I understand the proof that the Jacobian matrix is the best linear approximation of a function in the limit. But since it's a matrix, it has different dimensions than the original function. Shouldn't it be a vector too?
Given the Jacobian, how would I find or plot where it takes some given 'input'? Since the derivative is a function, I suppose one could transform it into something like $f'(x_1,\dots,x_n)=(y_1,\dots,y_n)$.

I tried to be as clear as possible but I don't think I put it well into words.


EDIT: To clarify, my question is how to find, explicitly, an arbitrary element of the image of the derivative. Does it need two vectors/points to be defined? If so, why not only one point, as in single-variable calculus?

Best Answer

There’s a subtle misunderstanding in your question that might be the source of some confusion. The Jacobian matrix (i.e., the differential) isn’t the best linear approximation to a function near a point $\mathbf x_0$. It’s the best linear approximation to the change in the function’s value near the point. That is, $$\Delta f_{\mathbf x_0}=f(\mathbf x_0+\Delta\mathbf x)-f(\mathbf x_0)=\operatorname{d}f_{\mathbf x_0}[\Delta\mathbf x]+o(\|\Delta \mathbf x\|).$$ Observe that $\operatorname{d}f_{\mathbf x_0}$ operates on a displacement $\Delta\mathbf x$ (technically, on a vector in the tangent space at $\mathbf x_0$), not on an element of $f$’s domain. If $f:\mathbb R^n\to\mathbb R^m$, then $\operatorname{d}f_{\mathbf x_0}:\mathbb R^n\to\mathbb R^m$, and in matrix terms this linear transformation is represented by multiplication by the $m\times n$ matrix $\mathbf{Jac}_f(\mathbf{x_0})$, i.e., $$\operatorname{d}f_{\mathbf x_0}:\mathbf h\mapsto\mathbf{Jac}_f(\mathbf{x_0})\,\mathbf h,$$ where $\mathbf h$ is the $n\times 1$ column vector corresponding to $\Delta\mathbf x$. One way to think of $\operatorname{d}f$ is as a rule that assigns an $m\times n$ matrix, $\mathbf{Jac}_f(\mathbf{x})$, to each point $\mathbf x$ of the domain of $f$. The matrix can vary from point to point, of course. This is also why two inputs are involved: the base point $\mathbf x_0$ selects the matrix, and the displacement $\mathbf h$ is what the matrix multiplies. In single-variable calculus the matrix is $1\times1$, so it is usually identified with the single number $f'(x_0)$.
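To make the shapes concrete, here is a minimal Python/NumPy sketch. The map `f` (from $\mathbb R^2$ to $\mathbb R^3$) and the helper `numerical_jacobian` are made up for illustration and are not part of the answer above; the Jacobian is only approximated, using central differences.

```python
import numpy as np

def f(x):
    # Illustrative map R^2 -> R^3 (an assumption, chosen only for this sketch).
    return np.array([x[0] * x[1], np.sin(x[0]), x[1] ** 2])

def numerical_jacobian(func, x0, eps=1e-6):
    """Approximate the Jacobian of func at x0 by central differences."""
    x0 = np.asarray(x0, dtype=float)
    columns = []
    for j in range(x0.size):
        step = np.zeros_like(x0)
        step[j] = eps
        # Column j holds the partial derivatives with respect to x_j.
        columns.append((func(x0 + step) - func(x0 - step)) / (2 * eps))
    return np.column_stack(columns)  # shape (m, n)

x0 = np.array([1.0, 2.0])    # base point: selects which matrix we get
h = np.array([0.01, -0.02])  # displacement: what the matrix acts on

J = numerical_jacobian(f, x0)
print(J.shape)        # (3, 2)
print((J @ h).shape)  # (3,)
```

The matrix is $m\times n$, but its product with a displacement in $\mathbb R^n$ is a vector in $\mathbb R^m$, the same space the values of `f` live in.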

As a concrete example, take $f:(x,y)\mapsto x^2-y^2$ as in your comments to another answer. The Jacobian at the point $(x_0,y_0)$ is the $1\times 2$ matrix $\begin{bmatrix}2x_0&-2y_0\end{bmatrix}$, so the best linear approximation to $f$ at $(x_0,y_0)$ is $$\begin{align}f(x,y)&\approx f(x_0,y_0)+\begin{bmatrix}2x_0&-2y_0\end{bmatrix}\begin{bmatrix}x-x_0\\y-y_0\end{bmatrix} \\ &= x_0^2-y_0^2+2x_0(x-x_0)-2y_0(y-y_0).\end{align}$$ To illustrate this with some concrete values, $$f(3.2,-5.1)\approx3^2-(-5)^2+\begin{bmatrix}6&10\end{bmatrix}\begin{bmatrix}0.2&-0.1\end{bmatrix}^T=9-25+0.2=-15.8.$$ The actual value is $-15.77$.
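The arithmetic in this example can be checked with a few lines of Python/NumPy. This is only a sketch: the names `f`, `jac`, `p0` and `p` are chosen here for illustration, and the Jacobian is typed in by hand from the formula above.

```python
import numpy as np

def f(p):
    x, y = p
    return x**2 - y**2

def jac(p):
    # Hand-coded 1x2 Jacobian [2x, -2y] of f at the point p.
    x, y = p
    return np.array([[2 * x, -2 * y]])

p0 = np.array([3.0, -5.0])  # base point (x_0, y_0)
p = np.array([3.2, -5.1])   # point where we want to estimate f
h = p - p0                  # displacement fed to the differential

approx = f(p0) + (jac(p0) @ h)[0]
exact = f(p)
print(f"{approx:.2f} {exact:.2f}")  # -15.80 -15.77
```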

A somewhat more interesting example is $f:\begin{bmatrix}x\\y\\z\end{bmatrix}\mapsto\begin{bmatrix}x^2-y^2+3z\\2xy\end{bmatrix}$. The Jacobian is $\begin{bmatrix}2x&-2y&3\\2y&2x&0\end{bmatrix}$, and so $$f(x,y,z)\approx\begin{bmatrix}x_0^2-y_0^2+3z_0\\2x_0y_0\end{bmatrix}+\begin{bmatrix}2x_0&-2y_0&3\\2y_0&2x_0&0\end{bmatrix}\begin{bmatrix}x-x_0\\y-y_0\\z-z_0\end{bmatrix}$$ and $$f(3.2,-5.1,1.7)\approx\begin{bmatrix}3^2-(-5)^2+3\cdot1\\2\cdot3\cdot(-5)\end{bmatrix}+\begin{bmatrix}6&10&3\\-10&6&0\end{bmatrix}\begin{bmatrix}0.2\\-0.1\\0.7\end{bmatrix}=\begin{bmatrix}-13\\-30\end{bmatrix}+\begin{bmatrix}2.3\\-2.6\end{bmatrix}=\begin{bmatrix}-10.7\\-32.6\end{bmatrix}.$$ The actual value is $\begin{bmatrix}-10.67&-32.64\end{bmatrix}^T$.
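The same check works for this vector-valued example, and it also gives a way to see the $o(\|\Delta\mathbf x\|)$ term at work: shrink the displacement and watch the error relative to its length. The sketch below assumes NumPy and uses illustrative names (`f`, `jac`, `p0`, `h`, `ratios`).

```python
import numpy as np

def f(p):
    x, y, z = p
    return np.array([x**2 - y**2 + 3 * z, 2 * x * y])

def jac(p):
    # Hand-coded 2x3 Jacobian of f at the point p.
    x, y, z = p
    return np.array([[2 * x, -2 * y, 3.0],
                     [2 * y,  2 * x, 0.0]])

p0 = np.array([3.0, -5.0, 1.0])
h = np.array([0.2, -0.1, 0.7])

linear = f(p0) + jac(p0) @ h  # linear approximation at p0 + h
exact = f(p0 + h)             # true value at p0 + h
print(linear, exact)

# Scale the displacement down and compare the remainder with its length.
ratios = []
for t in (1.0, 0.1, 0.01):
    step = t * h
    remainder = f(p0 + step) - f(p0) - jac(p0) @ step
    ratios.append(np.linalg.norm(remainder) / np.linalg.norm(step))
print(ratios)
```

For this quadratic $f$ the remainder is exactly quadratic in the displacement, so each ratio should be roughly a tenth of the previous one.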