Multivariable Calculus – Row vs. Column Vector Derivative

multivariable-calculus, notation

I am reviewing calculus from Spivak's Calculus on Manifolds, and it looks like he is being (very?) cavalier with regard to when vectors should be written as columns or rows.

Let $f:\mathbf{R}^n \to \mathbf{R}^m$ be a differentiable function, and let $a \in \mathbf{R}^n$. Let $Df(a)$ denote the derivative of $f$ at $a$ (just to be clear, by the definition Spivak uses, $Df(a)$ is the linear map such that $\lim\limits_{h \to 0} \dfrac{||f(a+h)-f(a)-Df(a)(h)||}{||h||} = 0$).

Now, he says that if $f(x) = (f_1(x),\ldots,f_m(x))$ then $Df(a) = (Df_1(a),\ldots,Df_m(a))$. This seems simple enough, and the proof is very easy, but he follows it up by saying that the matrix of this map, denoted $f'(a)$, is the matrix with $f_i'(a)$ as the $i$th row.
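Spelled out, and assuming the usual identification of each $f_i'(a)$ with the $1 \times n$ row of partial derivatives of $f_i$ (the notation $D_j f_i(a)$ for the partial derivative of $f_i$ with respect to the $j$th variable is introduced here only for illustration), the matrix being described would be the $m \times n$ array

$$f'(a) = \begin{bmatrix} D_1 f_1(a) & D_2 f_1(a) & \cdots & D_n f_1(a) \\ D_1 f_2(a) & D_2 f_2(a) & \cdots & D_n f_2(a) \\ \vdots & \vdots & & \vdots \\ D_1 f_m(a) & D_2 f_m(a) & \cdots & D_n f_m(a) \end{bmatrix}\,.$$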

But does this make sense? For the matrix to look like that, he would have had to write $Df(a)$ as a column, no?

Best Answer

It is mostly a typographical convenience and sloppiness on the part of authors and typesetters. You do want to think about vectors as column vectors when you're doing calculus and you want to think of functions $f\colon\Bbb R^n\to\Bbb R^m$ as $$f\left(\begin{matrix}x_1\\x_2\\\vdots\\x_n\end{matrix}\right) = \begin{bmatrix} f_1(\mathbf x) \\ f_2(\mathbf x) \\ \vdots \\ f_m(\mathbf x)\end{bmatrix}\,.$$ It's particularly important to do this so that, as in your case, one doesn't get confused about rows and columns in the derivative matrix. The $i$th row of $Df(\mathbf x)$ should consist of the partial derivatives of $f_i$.
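For anyone who wants to check the row convention numerically, here is a minimal sketch in Python with NumPy. The map `f` and the helper `jacobian` are hypothetical names invented for this illustration, and the central-difference approximation is just one convenient way to estimate the partial derivatives, not anything taken from Spivak.

```python
import numpy as np


def f(x):
    """Hypothetical example map f: R^3 -> R^2 (chosen only for illustration)."""
    x1, x2, x3 = x
    return np.array([x1 * x2, x2 + x3 ** 2])


def jacobian(func, a, eps=1e-6):
    """Approximate the matrix of Df(a) by central differences.

    Column j holds the partial derivatives with respect to x_j,
    so row i ends up holding all partial derivatives of f_i.
    """
    a = np.asarray(a, dtype=float)
    m = func(a).shape[0]
    n = a.shape[0]
    J = np.zeros((m, n))
    for j in range(n):
        h = np.zeros(n)
        h[j] = eps
        J[:, j] = (func(a + h) - func(a - h)) / (2 * eps)
    return J


a = np.array([1.0, 2.0, 3.0])
J = jacobian(f, a)

# An m x n matrix (here 2 x 3): one row per component function f_i.
print(J.shape)

# Applying the derivative to a column vector h gives a column vector.
h_col = np.array([[0.1], [0.2], [0.3]])
print((J @ h_col).shape)
```

With this choice of `f`, row 0 of `J` approximates the partials of $f_1 = x_1 x_2$ and row 1 approximates those of $f_2 = x_2 + x_3^2$, and the product `J @ h_col` only makes sense dimensionally when `h_col` is an $n \times 1$ column.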

It's tedious to typeset a book like this and, honestly, it wastes a lot of room, but it saves confusion. A few rigorous multivariable calculus books I've seen (and I'm sure a few I'm not remembering) try to be careful about this, among them (1) Hubbard and Hubbard, (2) Williamson, Crowell, and Trotter, and (3) my own. (By way of a comment, I'll add that I've had complaints from a few readers of my own differential geometry notes that I reverted to standard sloppiness and denoted vectors and functions by the more convenient $(x,y,z)$ notation.)
