[Math] How to understand rank-nullity / dimension theorem proof

linear-algebra, vector-spaces

OK, I am working on proofs of the rank-nullity theorem (known in my class as the dimension theorem).

Here's a proof that my professor gave in class. I want to be sure I understand the reasoning, so I will lay out what he had here in less-precise layman's wording, as I want to be sure I know what I am doing. It also makes the proof easier for me to memorize.

So:

Let $V$ and $W$ be vector spaces.

$T: V \to W$ is linear and $V$ is finite-dimensional; in other words we have a

function $f \in \mathrm{Hom}_K(V, W)$ (so $T$ and $f$ denote the same map below).

Let $\dim(V) = n$ for some $n \in \mathbb{N}$ and $\dim(\ker(f)) = r$.

$\dim(V) = \mathrm{nullity}(T) + \mathrm{rank}(T) = \dim(\ker(f)) + \dim(\mathrm{Im}(f))$

in some notations (like the one in our text) this would look like $\dim(V) = \mathrm{nullity}(T) + \mathrm{rank}(T) = \dim(N(T)) + \dim(R(T))$
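For a concrete sanity check (a toy example of my own, not from the class): take $f: \mathbb{R}^3 \to \mathbb{R}^2$, $f(a,b,c) = (a,b)$. Then $\ker(f) = \{(0,0,c) : c \in \mathbb{R}\}$ has dimension 1, $\mathrm{Im}(f) = \mathbb{R}^2$ has dimension 2, and indeed $\dim(V) = 3 = 1 + 2$.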

on to the proof:

$\ker(f) \subseteq V$, and it is a subspace.

Why a subspace? Because the kernel of a linear map is the set of vectors that $f$ sends to zero, and the sum of two such vectors is again sent to zero, as is any scalar multiple of one (since they all go to zero).
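Written out in symbols (my own check of that closure argument): if $u, v \in \ker(f)$ and $\lambda \in K$, then by linearity
$$f(u+v) = f(u) + f(v) = 0 + 0 = 0 \quad \text{and} \quad f(\lambda u) = \lambda f(u) = \lambda \cdot 0 = 0,$$
so $u+v$ and $\lambda u$ are again in $\ker(f)$; also $f(0) = 0$, so $0 \in \ker(f)$.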

Since we let $\dim(V) = n$, all the bases (basis-es?) of $V$ will have $n$ elements.

therefore $\exists$ a basis $\{x_1, x_2, \dots, x_r\}$ of $\ker(f)$, where $r \le n$.

(The reason is that a subspace has dimension less than or equal to the dimension of the space that contains it, and $\ker(f)$ is a subspace of $V$.)
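For example (my own illustration): a plane through the origin in $\mathbb{R}^3$ is a subspace of dimension 2, while $\dim(\mathbb{R}^3) = 3$.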

by the exchange lemma, which says that any linearly independent subset of $V$ (here $\{x_1, x_2, \dots, x_r\}$) can be extended to a basis of $V$,
$\exists \; y_1, y_2, \dots, y_s \in V$ such that $\{y_1, y_2, \dots, y_s\} \cap \{x_1, x_2, \dots, x_r\} = \varnothing$;
the next step says that $\{y_1, y_2, \dots, y_s\} \cup \{x_1, x_2, \dots, x_r\}$ is a basis of $V$ (so $s = n - r$).

Now, my question is: is that because the intersection of the two sets is the empty set and the combined set is linearly independent?

After that, we get to saying that $\{f(y_1), f(y_2), \dots, f(y_s)\}$ is a basis of $\mathrm{Im}(f)$.

But I am not sure why that is.

He then says we can suppose the following:
$\lambda_1 f(y_1) + \lambda_2 f(y_2) + \cdots + \lambda_s f(y_s) = 0$

for some $\lambda_1, \lambda_2, \dots, \lambda_s \in K$.

so taking
$$f \Big( \sum_{i=1}^s \lambda_i y_i \Big) = \sum_{i=1}^s \lambda_i f(y_i) = 0,$$
we can make that into
$$\sum_{i=1}^s \lambda_i y_i \in \ker(f)$$

That step I am a bit fuzzy on the reasoning for. IIUC, it's just saying that applying $f$ to the sum of the $\lambda_i y_i$ terms equals zero, and since $f$ of that sum is the same as the sum of the $\lambda_i f(y_i)$ terms, the vector $\sum \lambda_i y_i$ is in the kernel of $f$. But I wanted to be sure.

He then said that the above implies that there exists some set of scalars, $\alpha_1, \alpha_2, \dots, \alpha_r \in K$, s.t.

$$\sum_{i=1}^s \lambda_i y_i = \sum_{j=1}^r \alpha_j x_j$$ and that further implies

$$\sum_{j=1}^r \alpha_j x_j - \sum_{i=1}^s \lambda_i y_i = 0$$

which implies $\alpha_j = \lambda_i = 0$ for all $1 \le j \le r$ and $1 \le i \le s$.

and that further implies that the set $\{f(y_1), f(y_2), \dots, f(y_s)\}$ is linearly independent.
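(If I am reading this right, that last step is exactly the definition of linear independence: the only way to write $\lambda_1 f(y_1) + \cdots + \lambda_s f(y_s) = 0$ is with every $\lambda_i = 0$.)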

Then he says: for all $z \in \mathrm{Im}(f)$ there exists $x \in V$ s.t. $z = f(x)$ (this seems obvious at one level, but I felt like it was just sleight of hand).

then, writing $x$ in that basis as $x = \sum_{j=1}^r \alpha_j x_j + \sum_{i=1}^s \lambda_i y_i$,

$z = f \Big(\sum_{j=1}^r \alpha_j x_j + \sum_{i=1}^s \lambda_i y_i \Big) = \sum_{j=1}^r \alpha_j f(x_j) + \sum_{i=1}^s \lambda_i f(y_i) = 0 + \sum_{i=1}^s \lambda_i f(y_i)$

and then he says $\dim(V) = r + s = \dim(\ker(f)) + \dim(\mathrm{Im}(f))$.

It's the last few steps I can't seem to justify in my head. Any help would be appreciated (and also checking whether I copied this wrong from the board).

Best Answer

Perhaps modifying your notation just a bit: $T: V \rightarrow W$ where $dim(V)=n$ and $dim(W)=m$. Our goal is to prove that $dim(V) = dim(Null(T))+dim(range(T))$, where $dim(Null(T)) = r$ and $dim(range(T))= rank(T)=s$. To prove this dimension theorem we need to exhibit bases (yes, "bases" is the plural) which serve as minimal spanning sets for the null space and range of $T$.

One approach: pick a basis for $V$, study the matrix of $T$, and steal this theorem from the corresponding theorem for the rank and nullity of a matrix. That theorem comes from the nuts and bolts of Gaussian elimination. I don't think that is what your professor intends, so back to the linear-algebraic argument.
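For what it's worth, here is a quick sanity check of that matrix version — a sketch using sympy with a matrix I made up, not anything from your class or text:

```python
# Sanity check of rank + nullity = number of columns, using sympy.
# The matrix A below is a made-up example, viewed as a map T: R^4 -> R^3.
from sympy import Matrix

A = Matrix([
    [1, 2, 0, 1],
    [0, 1, 1, 0],
    [1, 3, 1, 1],   # = row1 + row2, so the rank drops to 2
])

n = A.cols                    # dim of the domain V = R^4
rank = A.rank()               # dim of the image (column space) = 2
nullity = len(A.nullspace())  # dim of the kernel (null space) = 2

print(n, rank, nullity)       # 4 2 2
assert rank + nullity == n    # the dimension theorem for this matrix
```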

Note $ker(T) \leq V$, hence $ker(T)$ is a vector space, and as it is a subspace of a finite-dimensional vector space it has finite dimension as well, let's say $r$. Moreover, following your notation, $\beta_o=\{ x_1, x_2, \dots , x_r \}$ is a basis for $ker(T)$. I assume at this point you have already proved in your class that if a vector space has a basis with finitely many elements then any such basis has the same number of vectors. We call this number the dimension of the vector space (or subspace).

It is also a simple exercise to show $T(V) \leq W$, hence there exists a basis $\beta_z=\{ z_1, z_2, \dots , z_s \}$ for the range. To prove the theorem, we must show $r+s=n$.

If $z_j \in T(V)$ then there exists $y_j \in V$ such that $T(y_j)=z_j$ (pick one such preimage for each $j$, and write the resulting set as $T^{-1}(\beta_z)$). We show $\{y_1,y_2, \dots, y_s\}$ is linearly independent by supposing otherwise, towards a contradiction: suppose $c_1y_1+c_2y_2+ \cdots + c_sy_s=0$ with at least one $c_j \neq 0$. Then, since $T(0)=0$ and the image of a linear combination is the linear combination of the images, $$ c_1T(y_1)+c_2T(y_2)+ \cdots + c_sT(y_s)=0. $$ Hence $c_1z_1+c_2z_2 + \cdots + c_sz_s=0$ with at least one $c_j \neq 0$, so $\beta_z$ is linearly dependent. This contradicts our assumption that $\beta_z$ serves as a basis for the image of $T$. Therefore, $T^{-1}(\beta_z) = \{y_1,y_2, \dots , y_s \}$ is a linearly independent subset of $V$.

At this point, I would like to claim $\beta=\beta_o \cup T^{-1}(\beta_z)=\{x_1,x_2, \dots , x_r \} \cup \{y_1,y_2, \dots , y_s \}$ forms a LI subset of $V$. Notice $\beta_o \cap T^{-1}(\beta_z) = \emptyset$ (can you prove this?). Suppose $$c_1x_1+\cdots + c_rx_r+b_1y_1+ \cdots + b_sy_s =0. $$ Apply $T$ to both sides: each $x_j \in ker(T)$, so the $c_j$-terms vanish and we are left with $b_1z_1+ \cdots + b_sz_s =0$, which forces $b_1 = \cdots = b_s = 0$ by LI of $\beta_z$. The original equation then reduces to $c_1x_1+ \cdots + c_rx_r = 0$, which forces $c_1 = \cdots = c_r = 0$ by LI of $\beta_o$. Hence $\beta$ is LI.

Next, ignoring the fact you may have other theorems to use, we must show $\beta$ spans $V$. Let $v \in V$ and suppose $T(v)=w$. There exist $\alpha_1,\alpha_2, \dots , \alpha_s$ such that $w = \alpha_1z_1+ \cdots + \alpha_sz_s$ since $\beta_z$ forms a basis of $T(V)$. Furthermore, $T(v-\alpha_1y_1- \cdots -\alpha_sy_s) = T(v)-w=0$, hence $v-\alpha_1y_1- \cdots -\alpha_sy_s \in ker(T)$. Thus, there exist $\beta_1, \dots , \beta_r$ such that $v-\alpha_1y_1- \cdots -\alpha_sy_s = \beta_1x_1+ \cdots +\beta_rx_r$. Consequently, $$ v=\alpha_1y_1+ \cdots + \alpha_sy_s +\beta_1x_1+ \cdots +\beta_rx_r, $$ which shows $v \in span(\beta)$, and as $v$ was arbitrary we find $span(\beta)=V$.

Therefore, $\beta$ forms a basis for $V$ and so (by that other theorem I'm not proving here) it has $n$ elements. But $\beta$ also clearly has $r+s$ elements by its construction. Hence $r+s=n$ and the proof is complete.
