Why we assume a vector is a column vector in linear algebra, but in a matrix, the first index is a row index

linear-algebra, matrices, notation, vectors

Wouldn't it be more convenient to think of a matrix as several column vectors stacked side by side, so that the first index refers to a column and the second to a row (since we assume x is a column vector and x^T is a row vector)?

Somehow x is a column, but when we put two of those together and call the result a matrix, we index into rows first. This does not make sense to me, and as a programmer I see a problem with this notational convention.

Best Answer

I would speculate that the reason we index by rows, then columns, is the same reason that we - meaning we in the West specifically - read text vertically line by line, and horizontally within each line. This is most likely due to the historical fact that matrices are a Western notational invention (though of course there are historical precedents from other locales).

We treat a vector as a column matrix because we have chosen, when multiplying a vector by a matrix, to put the vector to the right of the matrix. I speculate that this is due to textual layout convenience (vertical uses space better than horizontal).
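To make that convention concrete, here is a minimal worked example with a generic 2x2 matrix (the entries a_{ij} and x_i are just placeholders): with x as a column on the right, the product Ax is again a column, and the first subscript of each entry a_{ij} is its row, which is exactly the row-then-column indexing of the matrix.

$$
A\mathbf{x} =
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} x_{1} \\ x_{2} \end{pmatrix}
=
\begin{pmatrix} a_{11} x_{1} + a_{12} x_{2} \\ a_{21} x_{1} + a_{22} x_{2} \end{pmatrix}
$$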

As for programming, you have my sympathy. In a computer a matrix has to be mapped to linear memory, and there are two obvious ways to do this: row-major (successive rows are concatenated linearly) and column-major (successive columns are concatenated linearly). The designers of Fortran picked column-major; the designers of C and many later languages picked row-major. This means that if you have to pass matrix data back and forth between C code and Fortran code (and I speak from personal experience here), you have to transpose, either psychologically (in how you nest your loops) or physically (by copying data). I've done lots of both.
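As an illustration of that last point, here is a small C sketch (the 2x3 matrix and its values are just an example of my own, not anything from a particular codebase). It prints the same matrix in both memory orders: the row-major order is what C actually stores, and the column-major order is how Fortran would lay out the same matrix.

```c
#include <stdio.h>

#define ROWS 2
#define COLS 3

int main(void) {
    /* A 2x3 matrix:
       1 2 3
       4 5 6 */
    double a[ROWS][COLS] = {{1, 2, 3}, {4, 5, 6}};

    /* C stores this row-major: successive rows are adjacent in memory. */
    double *flat = (double *)a;
    printf("C (row-major) memory order:        ");
    for (int k = 0; k < ROWS * COLS; k++)
        printf("%g ", flat[k]);            /* prints 1 2 3 4 5 6 */
    printf("\n");

    /* Fortran would store the same matrix column-major: successive
       columns adjacent in memory.  Walking the C array column by column
       shows the order a Fortran routine would expect the data in. */
    printf("Fortran (column-major) order:      ");
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            printf("%g ", a[i][j]);        /* prints 1 4 2 5 3 6 */
    printf("\n");

    return 0;
}
```

Compiled with any C compiler, it prints 1 2 3 4 5 6 for the row-major order and 1 4 2 5 3 6 for the column-major order; the second loop nest, with the index roles swapped, is the "psychological transpose" mentioned above, while copying the data into the other order would be the physical one.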