Dual Vectors – Understanding Standard Convention for Denoting Dual Vectors as Row Vectors

change-of-basis, convention, dual-spaces, linear algebra

I found this claim in my lecture notes on real analysis in the chapter on the gradient.

When using coordinates, the standard convention is to denote vectors
as columns, and covectors as rows.

The distinction is also essential in physics, as the two kinds of
vectors transform differently under coordinate changes.

I think I understand what they mean by the second sentence. To see this, let $e_1,…,e_n$ and $e_1',…,e_n'$ be two bases for the vector space $V$. From each basis we can construct a corresponding basis for the dual vector space $V^*$. In particular,

the functionals $e^1,…,e^n$ defined by $e^j(\alpha_1 e_1+…+\alpha_n e_n)=\alpha_j$, $j=1,…,n$, form the basis for $V^*$ constructed using the basis $e_1,…,e_n$ for $V$.

Similarly,

the functionals $(e^1)',…,(e^n)'$ defined by $(e^i)'(\beta_1 e_1'+…+\beta_n e_n')=\beta_i$, $i=1,…,n$, form the basis for $V^*$ constructed using the basis $e_1',…,e_n'$ for $V$.
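For concreteness, in $\Bbb R^2$ with the standard basis $e_1=(1,0)$, $e_2=(0,1)$, the dual basis is simply the pair of coordinate projections:

$e^1(x\,e_1+y\,e_2)=x$ and $e^2(x\,e_1+y\,e_2)=y$.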

Now let

\begin{equation*}
P =
\begin{pmatrix}
p_{1,1} & p_{1,2} & \cdots & p_{1,n} \\
p_{2,1} & p_{2,2} & \cdots & p_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
p_{n,1} & p_{n,2} & \cdots & p_{n,n}
\end{pmatrix}
\end{equation*}

be the $n \times n$ matrix for the change of basis from $e_1,…,e_n$ to $e_1',…,e_n'$, and let $\alpha$ and $\beta$ be the column vectors of coordinates of a vector $v \in V$ with respect to these two bases, respectively. Then

$P \alpha = \beta$ or $\beta_i=\sum \limits_{j=1}^{n}p_{ij} \cdot \alpha_j$.
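For example, with $n=2$, if

\begin{equation*}
P=\begin{pmatrix}1 & 1\\ 0 & 1\end{pmatrix}, \qquad \alpha=\begin{pmatrix}2\\ 3\end{pmatrix},
\end{equation*}

then $\beta=P\alpha=\begin{pmatrix}5\\ 3\end{pmatrix}$.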

Since $(e^i)'(v)=\beta_i$ and $e^j(v)=\alpha_j$, this means that

$(e^i)'(v)=\beta_i=\sum \limits_{j=1}^{n}p_{ij}\,\alpha_j=\sum \limits_{j=1}^{n}p_{ij}\, e^j(v)$, i.e. $(e^i)'=\sum \limits_{j=1}^{n}p_{ij}\, e^j$.

Now consider

$v^*=\delta_1 (e^1)'+…+\delta_n (e^n)'$.

Using the relation between the two bases of $V^*$, we find

$v^*=\delta_1 \sum \limits_{j=1}^{n}p_{1j} e^j+…+\delta_n \sum \limits_{j=1}^{n}p_{nj} e^j$, which can equivalently be written as

$v^*=(\sum \limits_{i=1}^{n} \delta_i p_{i1}) e^1+…+(\sum \limits_{i=1}^{n} \delta_i p_{in}) e^n=\gamma_1 e^1+…+\gamma_n e^n$.

If we write the coordinate vectors $\gamma$ and $\delta$ for $v^* \in V^*$ as row vectors, then

$\delta P=\gamma$ or $\delta=\gamma P^{-1}$, so the change of basis matrix for the dual space is the inverse of the change of basis matrix $P$ for the vector space itself.
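As a quick numerical sanity check (a minimal sketch of my own using numpy, not part of the notes), one can pick a random invertible $P$ and verify that, with $\beta=P\alpha$ and $\gamma=\delta P$, the covector coordinates indeed transform with $P^{-1}$ and the value $v^*(v)$ comes out the same in both bases:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3

# Change-of-basis matrix P; a random matrix is almost surely invertible.
P = rng.standard_normal((n, n))

alpha = rng.standard_normal(n)  # coordinates of v in the basis e_1, ..., e_n
delta = rng.standard_normal(n)  # coordinates of v* in the dual basis (e^1)', ..., (e^n)'

beta = P @ alpha    # coordinates of v in the primed basis:          beta = P alpha
gamma = delta @ P   # coordinates of v* in the unprimed dual basis:  gamma = delta P

# Covector coordinates transform with the inverse of P: delta = gamma P^{-1}.
assert np.allclose(delta, gamma @ np.linalg.inv(P))

# The pairing v*(v) does not depend on which basis is used to compute it.
assert np.allclose(gamma @ alpha, delta @ beta)
print("v*(v) =", gamma @ alpha)
```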

However, I still cannot see why it is convenient to denote the coordinate vectors of elements of the dual space as row vectors. Can someone explain other reasons why this is useful? Of course, in this case we avoid having to transpose the matrix $P$ before inverting it (which we would have to do if we wanted to use column vectors of coordinates for the dual space as well), but that seems like a minor point to me. I feel like it might have something to do with inner products.

Best Answer

A real $m\times n$ matrix is naturally identified with a linear map $\Bbb R^n \to \Bbb R^m$. We identify elements of $\Bbb R^n$ with column vectors ($n \times 1$ matrices) mostly because of the longstanding notational tradition of "operator on left, argument on right": $f(x)$, not $(x)f$. Because of how matrix multiplication is defined, matrices multiply column vectors on the left and row vectors on the right.

Now a dual vector is a linear functional on $\Bbb R^m$, which is by definition a linear map $\Bbb R^m \to \Bbb R$, which means it is naturally associated with a $1 \times m$ matrix, i.e. a row vector.
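Written out, evaluating such a functional is exactly a $1 \times m$ by $m \times 1$ matrix product:

\begin{equation*}
\begin{pmatrix} a_1 & a_2 & \cdots & a_m \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{pmatrix}
= a_1 x_1 + a_2 x_2 + \cdots + a_m x_m .
\end{equation*}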

One could choose to represent $\Bbb R^n$ as row vectors instead (either redefining matrix multiplication, or just acknowledging that writing $vM$ really isn't that big of a deal). But if you do, you will find that dual vectors are naturally column vectors. It is the representation of linear maps as matrices that forces this.
