[Math] Understanding the proof of independence-dimension inequality

linear algebra

I'm self-studying the book Introduction to Applied Linear Algebra – Vectors, Matrices, and Least Squares.

Below is an excerpt from the book:

Independence-dimension inequality. If the $n$-vectors $\vec{a_1}, \cdots, \vec{a_k}$ are linearly independent, then $k\leq n$. In words:

A linearly independent collection of $n$-vectors can have at most $n$ elements.

Put another way:

Any collection of $n+1$ or more $n$-vectors is linearly dependent.

Proof of independence-dimension inequality. The proof is by induction on the dimension $n$. First consider a linearly independent collection $\vec{a_1}, \cdots, \vec{a_k}$ of $1$-vectors. We must have $\vec{a_1}\neq 0$. This means that every element $\vec{a_i}$ of the collection can be expressed as a multiple $\vec{a_i}=(\vec{a_i}/\vec{a_1})\vec{a_1}$ of the first element $\vec{a_1}$. This contradicts linear independence unless $k=1$.
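
To make the base case concrete (this example is not in the book's text): take the $1$-vectors $\vec{a_1} = (2)$ and $\vec{a_2} = (3)$. Then

$$
\vec{a_2} = \frac{3}{2}\,\vec{a_1}, \qquad \text{i.e.} \qquad \frac{3}{2}\vec{a_1} - \vec{a_2} = 0,
$$

so the pair is linearly dependent; only a single nonzero $1$-vector can form an independent collection.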

Next suppose $n\geq 2$ and the independence-dimension inequality holds for dimension $n-1$. Let $\vec{a_1}, \cdots, \vec{a_k}$ be a linearly independent list of $n$-vectors. We need to show that $k\leq n$. We partition the vectors as

$$
\vec{a}_i=
\begin{bmatrix}
\vec{b_i} \\
\alpha_i
\end{bmatrix},
\qquad i = 1, \ldots, k,
$$

where $\vec{b_i}$ is an $(n-1)$-vector and $\alpha_i$ is a scalar.
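
For instance (a hypothetical example, not taken from the book), a $3$-vector splits as

$$
\vec{a_i} =
\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
=
\begin{bmatrix} \vec{b_i} \\ \alpha_i \end{bmatrix},
\qquad
\vec{b_i} = \begin{bmatrix} 1 \\ 2 \end{bmatrix},
\quad \alpha_i = 3.
$$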

First suppose that $\alpha_1 = \cdots = \alpha_k=0$. Then the vectors $\vec{b_1}, \cdots, \vec{b_k}$ are linearly independent: $\sum_{i=1}^k\beta_i\vec{b_i}=0$ holds if and only if $\sum_{i=1}^k\beta_i\vec{a_i}=0$, which is only possible for $\beta_1 = \ldots = \beta_k =0$ because the vectors $\vec{a_i}$ are linearly independent. The vectors $\vec{b_1}, \ldots, \vec{b_k}$ form a linearly independent collection of $(n-1)$-vectors. By the induction hypothesis we have $k\leq n-1$, so certainly $k \leq n$.
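
The "if and only if" step can be seen by writing the sum component-wise (a small expansion that the book leaves implicit): since every $\alpha_i = 0$,

$$
\sum_{i=1}^k \beta_i \vec{a_i}
=
\begin{bmatrix}
\sum_{i=1}^k \beta_i \vec{b_i} \\
\sum_{i=1}^k \beta_i \alpha_i
\end{bmatrix}
=
\begin{bmatrix}
\sum_{i=1}^k \beta_i \vec{b_i} \\
0
\end{bmatrix},
$$

so the left-hand side is zero exactly when $\sum_{i=1}^k \beta_i \vec{b_i} = 0$.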

Next suppose that the scalars $\alpha_i$ are not all zero. Assume $\alpha_j\neq 0$ for some index $j$. We define a collection of $k-1$ vectors $\vec{c_i}$ of length $n-1$ as follows:

$$
\vec{c_i} = \vec{b_i} - \frac{\alpha_i}{\alpha_j}\vec{b_j}, \quad i = 1, \ldots, j-1,
\qquad
\vec{c_i} = \vec{b_{i+1}} - \frac{\alpha_{i+1}}{\alpha_j}\vec{b_j}, \quad i = j, \ldots, k-1.
$$
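
For example (my own illustration, not from the book), if $k = 3$ and $j = 2$, the construction drops $\vec{a_2}$ and produces the two $(n-1)$-vectors

$$
\vec{c_1} = \vec{b_1} - \frac{\alpha_1}{\alpha_2}\vec{b_2},
\qquad
\vec{c_2} = \vec{b_3} - \frac{\alpha_3}{\alpha_2}\vec{b_2}.
$$

Each $\vec{c_i}$ is chosen so that the corresponding combination of the original vectors, $\vec{a_i} - \frac{\alpha_i}{\alpha_j}\vec{a_j}$, has last entry equal to zero.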

These $k-1$ vectors are linearly independent: If $\sum_{i=1}^{k-1} \beta_i \vec{c_i}=0$ then

\begin{equation} \tag{1}\label{eq:1}
\sum_{i=1}^{j-1}\beta_i
\begin{bmatrix}
\vec{b_i} \\
\alpha_i
\end{bmatrix}
+ \gamma
\begin{bmatrix}
\vec{b_j} \\
\alpha_j
\end{bmatrix}
+ \sum_{i=j+1}^k \beta_{i-1}
\begin{bmatrix}
\vec{b_i} \\
\alpha_i
\end{bmatrix}
=0
\end{equation}

with
$$
\gamma = -\frac{1}{\alpha_j}\left(\sum_{i=1}^{j-1}\beta_i\alpha_i + \sum_{i=j+1}^k \beta_{i-1}\alpha_i\right).
$$

Since the vectors $\vec{a_i}=(\vec{b_i}, \alpha_i)$ are linearly independent, the equality $\eqref{eq:1}$ only holds when the coefficients $\beta_i$ and $\gamma$ are all zero. This in turn implies that the vectors $\vec{c_1}, \ldots, \vec{c_{k-1}}$ are linearly independent. By the induction hypothesis $k-1 \leq n-1$, so we have established that $k \leq n$.

My Question:

After reading it a few times, my understanding of the proof's ideas is:

  • First, prove that the independence-dimension inequality holds for $n$-vectors when $n=1$.
  • Then, prove that for $n\geq 2$, if the independence-dimension inequality holds for $(n-1)$-vectors, then it holds for $n$-vectors.

I'm stuck on part 2. Where does equation $\eqref{eq:1}$ come from? Especially the $\gamma$.

Best Answer

If $\sum_{i=1}^{k-1}\beta_i\vec{c}_i=0$, then we have

$$\sum_{i=1}^{j-1}\beta_i\vec{c}_i + \sum_{i=j}^{k-1}\beta_i \vec{c}_i=0$$

Then we have from definition of $\vec{c}_i$,

$$\sum_{i=1}^{j-1}\beta_i\left(\vec{b}_i-\frac{\alpha_i}{\alpha_j}\vec{b}_j\right) + \sum_{i=j}^{k-1}\beta_i \left(\vec{b}_{i+1}-\frac{\alpha_{i+1}}{\alpha_j}\vec{b}_j\right)=0$$

By shifting index by a place in the second sum, we have

$$\sum_{i=1}^{j-1}\beta_i\left(\vec{b}_i-\frac{\alpha_i}{\alpha_j}\vec{b}_j\right) + \sum_{i=j+1}^{k}\beta_{i-1} \left(\vec{b}_{i}-\frac{\alpha_{i}}{\alpha_j}\vec{b}_j\right)=0$$

Let's write out the $\vec{b}_j$ term explicitly:

$$\sum_{i=1}^{j-1}\beta_i \vec{b}_i-\frac1{\alpha_j}\left(\sum_{i=1}^{j-1}\beta_i\alpha_i + \sum_{i=j+1}^{k}\beta_{i-1} \alpha_{i}\right)\vec{b}_j + \sum_{i=j+1}^{k}\beta_{i-1}\vec{b}_i=0$$

We denote the coefficient of $\vec{b}_j$ by $\gamma$.

Hence $\gamma = -\frac1{\alpha_j}\left(\sum_{i=1}^{j-1}\beta_i\alpha_i + \sum_{i=j+1}^{k}\beta_{i-1} \alpha_{i}\right)$ and we have

$$\sum_{i=1}^{j-1}\beta_i \vec{b}_i+\gamma\vec{b}_j + \sum_{i=j+1}^{k}\beta_{i-1}\vec{b}_i=0\tag{2}$$

Also check that

\begin{align}&\sum_{i=1}^{j-1}\beta_i \alpha_i+\gamma\alpha_j + \sum_{i=j+1}^{k}\beta_{i-1}\alpha_i\\&=\sum_{i=1}^{j-1}\beta_i \alpha_i -\frac1{\alpha_j}\left(\sum_{i=1}^{j-1}\beta_i\alpha_i + \sum_{i=j+1}^{k}\beta_{i-1} \alpha_{i}\right)\alpha_j + \sum_{i=j+1}^{k}\beta_{i-1}\alpha_i=0 \tag{3}\end{align}

Equation $(1)$ is just equations $(2)$ and $(3)$ stacked together: equation $(2)$ gives its first $n-1$ components and equation $(3)$ gives its last component.

The expression might seem cryptic to you. What we usually do when developing a proof is something called working backward: on a draft paper, we construct the expression for $\gamma$ from equation $(3)$ first.
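
To see the construction in action numerically, here is a minimal sketch of my own (not part of the book or the answer above), using NumPy and hypothetical example vectors. It checks that when the $\vec{a}_i$ are linearly independent and $\alpha_j \neq 0$, the resulting $k-1$ vectors $\vec{c}_i$ of length $n-1$ are again linearly independent:

    import numpy as np

    # Hypothetical example: k = 3 linearly independent 4-vectors (n = 4), stored as rows.
    a = np.array([[1.0, 0.0, 2.0, 3.0],
                  [0.0, 1.0, 1.0, 2.0],
                  [1.0, 1.0, 0.0, 1.0]])

    b, alpha = a[:, :-1], a[:, -1]   # b_i = first n-1 entries, alpha_i = last entry
    j = 0                            # any index with alpha_j != 0 (0-based; here alpha_j = 3)
    assert alpha[j] != 0

    # Build the k-1 vectors c_i by dropping index j, as in the proof.
    c = np.array([b[i] - (alpha[i] / alpha[j]) * b[j]
                  for i in range(len(a)) if i != j])

    print(np.linalg.matrix_rank(a))  # 3: the a_i are linearly independent
    print(np.linalg.matrix_rank(c))  # 2: the c_i are linearly independent

Computing the rank is just a convenient numerical stand-in for checking linear independence; the proof itself never needs it.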
