The definitions are there to highlight sets that are important for understanding the linear transformation $T$. Since $T:V\rightarrow W$, the kernel of $T$ is the set of all elements of $V$ that $T$ maps to the zero vector of $W$. The range of $T$ is the set of all elements of $W$ that are the image of some element of $V$.
So, some simple examples:
Let $T:\mathbb{R} \rightarrow \mathbb{R}$ be given by $T(x)=x$. Then Ker$(T)$ = $\{0\}$ ($T$ is the identity map, so the only element it sends to zero is zero itself) and Range$(T)$ is $\mathbb{R}$, because every element of $\mathbb{R}$ is its own image under $T$.
Let $T:\mathbb{R}^2 \rightarrow \mathbb{R}^2$ be given by $T(x,y) = (x+y, x-y)$. The kernel here is all elements of $\mathbb{R}^2$ that map to $(0,0)$ under $T$. This means solving the simultaneous equations $x+y=0$ and $x-y=0$, and you can see that $(0,0)$ is the only solution. So Ker$(T)=\{(0,0)\}$. Range$(T)$ is $\mathbb{R}^2$ again, because if you pick any target point $(\alpha, \beta)$ and solve the simultaneous equations $x+y=\alpha$ and $x-y=\beta$, then you find $x=\frac{1}{2}(\alpha+\beta)$ and $y=\frac{1}{2}(\alpha-\beta)$, i.e. there is a point $(x,y)$ that $T$ turns into $(\alpha, \beta)$.
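If you want to convince yourself of the surjectivity claim numerically, here is a minimal sketch in Python. It takes the solution $x=\frac{1}{2}(\alpha+\beta)$, $y=\frac{1}{2}(\alpha-\beta)$ of the two simultaneous equations and checks that $T$ sends it back to $(\alpha,\beta)$. The function names `T` and `preimage` and the sample targets are only illustrative choices, not part of the original problem.

```python
from fractions import Fraction

def T(x, y):
    """The map T(x, y) = (x + y, x - y) from the example."""
    return (x + y, x - y)

def preimage(alpha, beta):
    """Solve x + y = alpha, x - y = beta exactly (illustrative helper)."""
    half = Fraction(1, 2)
    return (half * (alpha + beta), half * (alpha - beta))

# A few arbitrary target points; any others would do just as well.
targets = [(3, 1), (0, 0), (-2, 5), (7, -4)]

# For each target, map its preimage forward again and compare.
round_trips = [T(*preimage(a, b)) == (a, b) for a, b in targets]
print(all(round_trips))
```

A finite list of targets is of course not a proof; the algebra above is what shows it for every $(\alpha,\beta)$.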
What is the theorem telling you? It's telling you that these sets have structure; they're not just random collections of points.
In both of the above examples the kernel consists of the origin only, and so is a 0-dimensional subspace. If we had an example where the kernel was bigger, it would have to have at least 1 dimension (subspaces have integer dimensions), so it would be a line (or plane, or hyperplane as the number of dimensions increases). In other words, all the elements that $T$ maps to zero are related to each other: you can find the line that $T$ maps to zero (that's what the kernel gives you).
As an example here, consider $T:\mathbb{R}^2 \rightarrow \mathbb{R}^2$ given by $T(x,y) = (x-y, 0)$. The kernel of T is now $\{(x,y) \in \mathbb{R}^2 : x=y\}$. This is a line in $\mathbb{R}^2$, and T maps any point on it to $(0,0)$.
The range also has structure in the same way (but you expect this because T has structure and T defines the range).
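For the example $T(x,y)=(x-y,0)$ above, the range can be read off directly. As a short worked sketch (the parameter $s$ is introduced here only for illustration):
\begin{align*}
T(x,y) &= (x-y,\,0) = (x-y)\,(1,0),\\
\text{Range}(T) &= \{(s,0) : s\in\mathbb{R}\}.
\end{align*}
Every point $(s,0)$ is attained, for instance as $T(s,0)$, so the range is the horizontal axis: again a line through the origin, not a random collection of points.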
Note also that if the kernel of a linear transformation is just the zero element then the transformation must be injective (one-to-one), which is often very useful to know.
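The reason is short enough to spell out. As a sketch, using only the linearity of $T$ and the assumption Ker$(T)=\{0\}$ (here $u$ and $v$ are arbitrary elements of $V$, named only for this illustration):
\begin{align*}
T(u)=T(v) &\implies T(u)-T(v)=0\\
&\implies T(u-v)=0\\
&\implies u-v\in\text{Ker}(T)=\{0\}\\
&\implies u=v.
\end{align*}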
For the kernel of $T$ you want those polynomials in $V$ that map to the zero matrix in $W$. So
\begin{align*}
T(a+bx+cx^2) & = \begin{bmatrix}0&0\\0&0\end{bmatrix}\\
\begin{bmatrix} a-b & b-c \\ 0 & c-a \end{bmatrix} & = \begin{bmatrix}0&0\\0&0\end{bmatrix}.
\end{align*}
This gives $a=b=c$. So the polynomials which lie in the kernel are of the form $a(1+x+x^2)$, where $a \in \mathbb{R}$. So a basis for the kernel is
$$\mathcal{B}_{\text{ker}}=\{1+x+x^2\} \implies \dim(\text{Ker} T)=1$$
Likewise, we can find a basis for the range. First we describe the range itself: suppose that $\begin{bmatrix}p&q\\r&s\end{bmatrix} \in \text{Range }(T)$. Then there exists some polynomial $a+bx+cx^2 \in V$ such that
\begin{align*}
T(a+bx+cx^2) & =\begin{bmatrix}p&q\\r&s\end{bmatrix}\\
\begin{bmatrix} a-b & b-c \\ 0 & c-a \end{bmatrix}& = \begin{bmatrix}p&q\\r&s\end{bmatrix}
\end{align*}
This gives the following system:
$$
\begin{aligned}
a-b & = p\\
b-c & = q\\
0 & = r\\
c-a&=s
\end{aligned}
\Longrightarrow
\left[\begin{array}{ccc|c}
1&-1&0&p\\
0&1&-1&q\\
0&0&0&r\\
-1&0&1&s
\end{array}\right]
\Longrightarrow
\left[\begin{array}{ccc|c}
1&-1&0&p\\
0&1&-1&q\\
0&0&0&r\\
0&0&0&p+q+s
\end{array}\right]
$$
The system has a solution exactly when $r=0$ and $p+q+s=0$, i.e. $s=-p-q$. So the range consists precisely of the matrices of the form $\begin{bmatrix}p&q\\0&-p-q\end{bmatrix}$. A basis for the range of $T$ now follows from the decomposition
$$\begin{bmatrix}p&q\\0&-p-q\end{bmatrix}=p\begin{bmatrix}1&0\\0&-1\end{bmatrix}+q\begin{bmatrix}0&1\\0&-1\end{bmatrix}$$
This shows that
$$\mathcal{B}_{\text{range}}=\left\{\begin{bmatrix}1&0\\0&-1\end{bmatrix},\begin{bmatrix}0&1\\0&-1\end{bmatrix}\right\} \implies \dim(\text{Range } T)=2$$
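Both dimensions can be double-checked mechanically. The sketch below is one possible way to do it in plain Python: it flattens each $2\times 2$ matrix to a 4-tuple, applies $T$ to the basis $1, x, x^2$, and computes a rank by Gaussian elimination over the rationals. The helper `rank` and the flattening convention are assumptions made for this illustration, not something taken from the problem statement.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        # Look for a nonzero entry in column c at or below row r.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Clear the entries below the pivot.
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def T(a, b, c):
    """T(a + bx + cx^2), with the 2x2 matrix flattened row by row."""
    return (a - b, b - c, 0, c - a)

# Images of the basis polynomials 1, x, x^2.
images = [T(1, 0, 0), T(0, 1, 0), T(0, 0, 1)]

dim_range = rank(images)        # dimension of the span of the images
dim_kernel = 3 - dim_range      # rank-nullity, since dim V = 3

# The proposed range basis, flattened the same way.
range_basis = [(1, 0, 0, -1), (0, 1, 0, -1)]
# Adding the basis to the images must not increase the rank.
basis_in_range = rank(images + range_basis) == dim_range

print(dim_range, dim_kernel, T(1, 1, 1), basis_in_range)
```

The check that $T(1,1,1)$ is the zero matrix confirms that $1+x+x^2$ lies in the kernel, in agreement with the basis found by hand.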
Best Answer
Note that $T$ is surjective, since for every $a\in\Bbb R$ we have $T(A)=a$, where $$ A=\begin{bmatrix}a & 0\\ 0 & 0\end{bmatrix} $$ Since the image is then all of the one-dimensional space $\Bbb R$, it follows that $\{1\}$ is a basis for $\DeclareMathOperator{\Image}{Image}\Image T$.
The Rank-Nullity theorem states $$ \dim\ker T+\dim\Image T=\dim M_{2\times 2} $$ Since $\Image T=\Bbb R$ and since \begin{align*} \dim\Bbb R &= 1 & \dim M_{2\times 2}&=4 \end{align*} it follows that $$ \dim\ker T=4-1=3 $$ So, to find a basis for $\ker T$, it suffices to find three linearly independent matrices in the kernel of $T$. But it can easily be checked that \begin{align*} \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} && \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} && \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \end{align*} are three such matrices.
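The "easily be checked" step can be sketched in a few lines of Python. Linear independence needs no knowledge of $T$: flatten each matrix to a 4-tuple and check that the rank is 3. Membership in the kernel does depend on $T$, whose formula is not restated in this answer. The code below assumes $T$ is the trace, $T(A)=a_{11}+a_{22}$, which is consistent with $T(A)=a$ for the matrix $A$ above but is an assumption, as are the names `rank` and `assumed_T`.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def assumed_T(flat):
    """ASSUMPTION: T is the trace; flat = (a11, a12, a21, a22)."""
    return flat[0] + flat[3]

# The three candidate kernel matrices, flattened row by row.
candidates = [
    (1, 0, 0, -1),
    (0, 1, 0, 0),
    (0, 0, 1, 0),
]

independent = rank(candidates) == 3
in_kernel = all(assumed_T(m) == 0 for m in candidates)
print(independent, in_kernel)
```

If $T$ is some other map, only `assumed_T` needs to change; the independence check stays the same.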