[Math] Linear Algebra – understanding the column picture

linear algebra

I'm proceeding through MIT OCW's 18.06 class in Linear Algebra, and I've reached a sticking point on the first lecture – I was wondering if someone can offer some clarification on a specific point for me.

My confusion is about how you can get away with building a column vector out of the coefficients of a single variable across a system of equations, and then treating those same coefficients as 'motion' along different coordinate axes.

For instance, in the system of equations below,

$$ \begin{aligned} 1x + 2y &= 2 \\ -3x + 4y &= 5 \end{aligned} $$

Professor Strang comes up with three separate column vectors.
For x, it is:

$$ \begin{bmatrix} 1 \\ -3 \end{bmatrix} $$

and for y, it is:

$$ \begin{bmatrix} 2 \\ 4 \end{bmatrix} $$

Similarly, he takes (2,5) as the answer vector, and we then look for the linear combination of the previous two vectors that produces it. I fully grasp the mechanics of how this works, but DO NOT understand how he can use the 'x' vector (with components 1 and -3) to draw a vector '1' unit in the x direction and '-3' units in the y direction using nothing but coefficients that came from the x variable! This is the crux of my confusion.

I would have an easier time understanding what was going on if the actual mechanics of creating the two vectors was to take the coefficients of each equation and put those into column vectors. For instance, if it were as follows:

$$ X = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \quad \text{and} \quad Y = \begin{bmatrix} -3 \\ 4 \end{bmatrix} $$

IF it were this way, it would make sense to me because it would correspond to the 'movement' each equation produced in each coordinate system – I also realize that this 'breaks' the idea of 'x' and 'y' because in this second example (in which I realize my understanding is wrong) I've arbitrarily assigned the 'labels' x and y to the separate vectors.

Can someone offer an explanation?

Thanks,

Luke

Best Answer

Without having worked through the course myself, this appears to be more a product of confusing notation than of a deeply rooted misunderstanding.

Instead, consider the following system:

$$ \begin{aligned} 1a + 2b &= 2 \\ -3a + 4b &= 5 \end{aligned} $$

The goal of creating vectors here is that we want to be able to write the set of equations above as $aA + bB = C$ for some column vectors $A, B, C$. If you think about it in those terms, then it becomes more straightforward why you need to define:

$$ A=\begin{bmatrix}1 \\ -3\end{bmatrix}, \quad B=\begin{bmatrix}2 \\ 4\end{bmatrix}, \quad C=\begin{bmatrix}2 \\ 5\end{bmatrix} $$

Now we can rewrite this system as $aA + bB = C$, which, written out component by component, is:

$$ \begin{bmatrix}1a \\ -3a\end{bmatrix} + \begin{bmatrix}2b \\ 4b\end{bmatrix} = \begin{bmatrix}2 \\ 5\end{bmatrix} $$
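If it helps to see this numerically, here is a minimal sketch in plain Python. The names `combine` and `solve_2x2` are my own, purely for illustration, and I use Cramer's rule only because it is short for a 2×2 system; it is not something the lecture prescribes. The point is that the scalars $a$ and $b$ scale whole columns, and the right pair of scalars lands exactly on $C$.

```python
from fractions import Fraction

# Columns of the system 1a + 2b = 2, -3a + 4b = 5.
# A holds the coefficients of a, B the coefficients of b, C the right-hand sides.
A = (Fraction(1), Fraction(-3))
B = (Fraction(2), Fraction(4))
C = (Fraction(2), Fraction(5))

def combine(a, b):
    """Return the linear combination a*A + b*B, one component per equation."""
    return tuple(a * A_i + b * B_i for A_i, B_i in zip(A, B))

def solve_2x2():
    """Solve a*A + b*B = C by Cramer's rule (assumes A and B are not parallel)."""
    det = A[0] * B[1] - B[0] * A[1]
    a = (C[0] * B[1] - B[0] * C[1]) / det
    b = (A[0] * C[1] - C[0] * A[1]) / det
    return a, b

a, b = solve_2x2()
print(a, b)                 # the scalars multiplying A and B
print(combine(a, b) == C)   # whether the combination reproduces C
```

Each entry of `combine(a, b)` is the left-hand side of one of the original equations, so asking for `combine(a, b) == C` is the same as asking for both equations to hold at once.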

It is not so much that these columns represent vectors in the traditional plane, but rather that they are a concise way to represent the system of equations. Ultimately, people tend to express such systems not as vector equations but as augmented matrices, which neatly summarize the system of equations.

$$ \left[\begin{array}{cc|c} 1 & 2 & 2 \\ -3 & 4 & 5 \end{array}\right] $$
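As a worked illustration of why this form is convenient (my own elimination steps, not taken from the lecture), adding 3 times the first row to the second removes the first unknown from the second equation:

$$ \left[\begin{array}{cc|c} 1 & 2 & 2 \\ -3 & 4 & 5 \end{array}\right] \xrightarrow{R_2 + 3R_1} \left[\begin{array}{cc|c} 1 & 2 & 2 \\ 0 & 10 & 11 \end{array}\right] $$

The second row now reads $10b = 11$, so $b = \tfrac{11}{10}$, and substituting into the first row gives $a = 2 - 2b = -\tfrac{1}{5}$.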

It is my opinion that letting go of parallels to the $\mathbb{R}^2$ plane as soon as possible when studying linear algebra is a worthwhile endeavour. The geometric picture can be a useful analogy, but leaning on it ultimately hampers the ability to think more broadly about the concepts.