The definitions are there to highlight sets that are important for understanding the properties of the linear transformation $T$. Since $T:V\rightarrow W$, the kernel of $T$ is the set of elements of $V$ that $T$ maps to $0_W$. The range of $T$ is the set of elements of $W$ that are the image of some element of $V$.
So, some simple examples:
Let $T:\mathbb{R} \rightarrow \mathbb{R}$ be given by $T(x)=x$. Then Ker$(T) = \{0\}$ (since $T$ is the identity map, $T(x)=0$ only when $x=0$) and Range$(T)$ is $\mathbb{R}$, because every element of $\mathbb{R}$ is the image of itself under $T$.
Let $T:\mathbb{R}^2 \rightarrow \mathbb{R}^2$ be given by $T(x,y) = (x+y, x-y)$. The kernel here is the set of elements of $\mathbb{R}^2$ that map to $(0,0)$ under $T$. Finding it means solving the simultaneous equations $x+y=0$ and $x-y=0$, and you can see that $(0,0)$ is the only solution. So Ker$(T)=\{(0,0)\}$. Range$(T)$ is $\mathbb{R}^2$ again, because if you pick any target point $(\alpha, \beta)$ and solve the simultaneous equations $x+y=\alpha$ and $x-y=\beta$, then you find $x=(1/2)(\alpha+\beta)$ and $y=(1/2)(\alpha-\beta)$, i.e. there is a point $(x,y)$ that $T$ turns into $(\alpha, \beta)$.
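The computation above can be sanity-checked numerically. This is a minimal sketch (the function names `T` and `preimage` are just illustrative labels, not anything from a library) that encodes the explicit solution of the simultaneous equations:

```python
from fractions import Fraction

def T(x, y):
    """The map T(x, y) = (x + y, x - y) from the example."""
    return (x + y, x - y)

def preimage(alpha, beta):
    """Solve T(x, y) = (alpha, beta):
    x = (1/2)(alpha + beta), y = (1/2)(alpha - beta)."""
    return (Fraction(alpha + beta, 2), Fraction(alpha - beta, 2))

# Kernel check: the only solution of T(x, y) = (0, 0) is the origin.
assert preimage(0, 0) == (0, 0)

# Range check: an arbitrary target (alpha, beta) is hit, so Range(T) = R^2.
alpha, beta = 7, 3
x, y = preimage(alpha, beta)
assert T(x, y) == (alpha, beta)
```

Because the preimage formula succeeds for every $(\alpha,\beta)$ and gives $(0,0)$ only at the origin, it confirms both Ker$(T)=\{(0,0)\}$ and Range$(T)=\mathbb{R}^2$.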
What is the theorem telling you? It's telling you that these sets have structure; they're not just random collections of points.
In both the above examples the kernel consists of just the origin, and so is a $0$-dimensional subspace. If we had an example where the kernel was bigger, it would have to have dimension at least $1$ (subspaces have integer dimensions), so it would be a line through the origin (or a plane, or a hyperplane, as the dimension increases). In other words, the elements that $T$ maps to zero are related to each other: in the one-dimensional case you can find the line that $T$ maps to zero (that is what the kernel gives you).
As an example here, consider $T:\mathbb{R}^2 \rightarrow \mathbb{R}^2$ given by $T(x,y) = (x-y, 0)$. The kernel of T is now $\{(x,y) \in \mathbb{R}^2 : x=y\}$. This is a line in $\mathbb{R}^2$, and T maps any point on it to $(0,0)$.
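A quick numerical sanity check of this example (a sketch, with `T` as an ad hoc function name):

```python
def T(x, y):
    """The map T(x, y) = (x - y, 0); its kernel is the line x = y."""
    return (x - y, 0)

# Every point on the line x = y maps to the origin...
assert all(T(t, t) == (0, 0) for t in range(-5, 6))

# ...while points off that line do not.
assert T(1, 2) != (0, 0)
```

Note also that Range$(T)$ here is the line $\{(a, 0) : a \in \mathbb{R}\}$, a $1$-dimensional subspace, illustrating that the range has structure too.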
The range has the same kind of structure (but you expect this, because $T$ is linear and $T$ determines the range).
Note also that if the kernel of a linear transformation is just the zero element then the transformation must be injective (one-to-one): if $T(u)=T(v)$ then $T(u-v)=0$, so $u-v$ lies in the kernel and hence $u=v$. This is often very useful to know.
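This injectivity can be spot-checked on the first $\mathbb{R}^2$ example, whose kernel is trivial. A small grid test (illustrative only; a finite check, not a proof):

```python
from itertools import product

def T(x, y):
    """The earlier example T(x, y) = (x + y, x - y), with Ker(T) = {(0, 0)}."""
    return (x + y, x - y)

# With a trivial kernel, T should be injective: no two grid points
# may share an image.
points = list(product(range(-3, 4), repeat=2))
images = [T(x, y) for x, y in points]
assert len(set(images)) == len(points)
```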
I have tried both Grassmann's formula and the relationship between the dimensions of the kernel, image, and domain of a linear map ($\dim E = \dim\mathrm{Im}\,f + \dim\mathrm{Ker}\,f$), together with the corresponding inequalities that follow from $v_1,\ldots,v_r\in E$ being linearly independent, but I don't get a consistent proof of either implication.
None of those is going to be sufficient, because they are all results about dimensions, and we don't want a result about dimensions. You could possibly use them in combination with other results, but they aren't actually needed here, and probably shouldn't be the first approach you think of.
Instead, first note that, if $E = \mathrm{Ker}(f) + \langle v_1,\ldots,v_r\rangle$, then we can choose some $w_1,\ldots,w_{n-r}\in\mathrm{Ker}(f)$ (where $n = \dim E$) to extend $v_1,\ldots,v_r$ to a basis of $E$. Then for any $u = \sum \alpha_iv_i + \sum\beta_iw_i$, we have $f(u) = \sum\alpha_if(v_i)+\sum\beta_if(w_i) = \sum\alpha_if(v_i)$ since $w_i\in\mathrm{Ker}(f)$, and note that this lies in $\langle f(v_1),\ldots,f(v_r)\rangle$, so $\mathrm{Im}(f)\subseteq\langle f(v_1),\ldots,f(v_r)\rangle$. The reverse inclusion is simple.
For the reverse implication, suppose that $E \neq \mathrm{Ker}(f) + \langle v_1,\ldots,v_r\rangle$. Then there is some $u \in E$ that does not lie in $\mathrm{Ker}(f) + \langle v_1,\ldots,v_r\rangle$. Now, if $f(u)$ lies in $\langle f(v_1),\ldots,f(v_r)\rangle$, then there are some $\alpha_1,\ldots,\alpha_r$ such that $f(u) = \sum\alpha_if(v_i) = f(\sum\alpha_iv_i)$. Thus, $w := u - \sum\alpha_iv_i$ has $f(w) = f(u) - f(\sum\alpha_iv_i) = 0$, so $w\in\mathrm{Ker}(f)$. But then, $u = w + \sum\alpha_iv_i$, so $u\in\mathrm{Ker}(f)+\langle v_1,\ldots,v_r\rangle$, a contradiction. Thus, $f(u)\not\in\langle f(v_1),\ldots,f(v_r)\rangle$, so $\mathrm{Im}(f)\neq \langle f(v_1),\ldots,f(v_r)\rangle$.
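The construction in the contradiction step, writing $u = w + \sum\alpha_iv_i$ with $w\in\mathrm{Ker}(f)$, can be made concrete. Below is a sketch using a sample map $f:\mathbb{R}^3\to\mathbb{R}^2$, $f(x,y,z)=(x,y)$, with $v_1, v_2$ chosen so that $f(v_1), f(v_2)$ span $\mathrm{Im}(f)$ (the map and vectors are my own illustrative choices, not from the question):

```python
def f(v):
    """A sample linear map f: R^3 -> R^2, f(x, y, z) = (x, y).
    Its kernel is the z-axis."""
    x, y, z = v
    return (x, y)

v1, v2 = (1, 0, 0), (0, 1, 0)  # f(v1), f(v2) span Im(f) = R^2

def decompose(u):
    """Follow the argument in the text: pick a_i with f(u) = sum a_i f(v_i),
    then w := u - sum a_i v_i lands in Ker(f)."""
    a1, a2 = f(u)  # works here because f(v1), f(v2) are the standard basis
    w = tuple(ui - a1*e1 - a2*e2 for ui, e1, e2 in zip(u, v1, v2))
    return w, a1, a2

u = (3, -2, 7)
w, a1, a2 = decompose(u)
assert f(w) == (0, 0)  # w lies in Ker(f)
assert tuple(wi + a1*e1 + a2*e2 for wi, e1, e2 in zip(w, v1, v2)) == u
```

The decomposition succeeds for every $u$ here precisely because $E = \mathrm{Ker}(f) + \langle v_1, v_2\rangle$ holds for this map; the proof shows that when it fails, some $f(u)$ must escape $\langle f(v_1),\ldots,f(v_r)\rangle$.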
Essentially the same proof works when $\mathrm{Ker}(T)=\{0_V\}$. Let $\{Tw_1,Tw_2,\ldots,Tw_k\}$ be a basis for the image of $T$. Take any $x \in V$. Then $Tx$ is a linear combination of the basis elements, say $Tx=\sum c_iTw_i$. But then $T(x-\sum c_iw_i)=0$, so $x-\sum c_iw_i\in\mathrm{Ker}(T)=\{0_V\}$, i.e. $x=\sum c_iw_i$. We have proved that $V$ is spanned by $w_1,w_2,\ldots,w_k$, so $V$ is finite dimensional.