In your first assertion you wrote $\Rightarrow$, which is correct, but in fact the converse $\Leftarrow$ is also (almost) true. There are several equivalent criteria for diagonalizability. I'll list them out for you.
Let $V$ be a finite-dimensional vector space over a field $F$, and let $T:V \to V$ be a linear map. Then the following statements are all equivalent.
- $T$ is diagonalizable.
- There is an ordered basis $\beta$ of $V$ consisting of eigenvectors of $T$.
- We can express the vector space $V$ as a direct sum of eigenspaces of $T$:
\begin{align}
V = \bigoplus \limits_{\lambda \in \sigma(T)} \ker(T-\lambda I)
\end{align}
- The characteristic polynomial of $T$ splits over $F$, and for every eigenvalue $\lambda$ of $T$,
\begin{align}
\dim \ker(T - \lambda I) = \text{algebraic multiplicity of $\lambda$}
\end{align}
(i.e. geometric multiplicity = algebraic multiplicity).
There are a few more equivalent statements if you know about Jordan canonical forms and minimal polynomials; for now, however, you should try to prove that the $2^{\text{nd}}$ and $3^{\text{rd}}$ statements are equivalent.
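If you'd like to experiment, here is a minimal numerical sketch (my own addition, not part of the criteria above) that tests the 4th condition with NumPy over $\Bbb{C}$, where the characteristic polynomial always splits; the eigenvalue clustering and tolerances are heuristic:

```python
import numpy as np

def is_diagonalizable(A, tol=1e-7):
    """Test condition 4 over C: for every eigenvalue, the geometric
    multiplicity must equal the algebraic multiplicity."""
    n = A.shape[0]
    eigvals = list(np.linalg.eigvals(A))
    while eigvals:
        lam = eigvals.pop()
        # Crude clustering of numerically close eigenvalues to
        # estimate the algebraic multiplicity of lam.
        close = [mu for mu in eigvals if abs(mu - lam) < tol]
        alg_mult = 1 + len(close)
        for mu in close:
            eigvals.remove(mu)
        # Geometric multiplicity: dim ker(A - lam I) = n - rank(A - lam I).
        geo_mult = n - np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol)
        if geo_mult < alg_mult:
            return False
    return True

print(is_diagonalizable(np.array([[1.0, 0.0], [5.0, 1.0]])))  # False (a shear)
print(is_diagonalizable(np.diag([1.0, 2.0])))                 # True
```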
Edit:
After looking at ThorWittich's answer, I modified my answer to include the assumption that the characteristic polynomial splits (I had implicitly assumed this throughout the discussion, as is usually done, but strictly speaking I should have stated it explicitly).
Edit 2:
The assumption that the characteristic polynomial splits over the given field is important. To show this, consider the matrix
\begin{align}
A =
\begin{pmatrix}
0 & -1 \\
1 & 0
\end{pmatrix} \in M_{2 \times 2}(\Bbb{R})
\end{align}
So we are working over the field $\Bbb{R}$. It is easy to see that the characteristic polynomial is $\chi_A(t) = t^2 + 1$, which doesn't split over $\Bbb{R}$; hence $A$ is NOT diagonalizable over $\Bbb{R}$. However, if we consider $A$ as an element of $M_{2 \times 2}(\Bbb{C})$, then the characteristic polynomial splits over $\Bbb{C}$, the two eigenvalues of $A$ are $i$ and $-i$, and it is easy to verify (either directly or by using the 4th condition above) that $A$ is diagonalizable over $\Bbb{C}$.
This shows the assumption is genuinely needed: the same matrix can fail to be diagonalizable over one field and yet be diagonalizable over a larger one.
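As a quick numerical check of this example (my own addition, using NumPy, which computes eigenvalues over $\Bbb{C}$):

```python
import numpy as np

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

# Over R the characteristic polynomial t^2 + 1 has no roots, but
# numpy.linalg.eig works over C and returns the eigenvalues i and -i.
eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)  # approximately [0.+1.j, 0.-1.j]

# Over C the eigenvectors form a basis, so A = P D P^{-1} there:
P, D = eigvecs, np.diag(eigvals)
print(np.allclose(P @ D @ np.linalg.inv(P), A))  # True
```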
The algebraic multiplicity of an eigenvalue $\gamma$ of $T\in\mathcal L(X)$, where $X$ is a Hilbert space, equals $\dim \bigcup_{k=1}^\infty \ker(T-\gamma I)^k$, which in the case of a compact operator equals $\dim \ker(T-\gamma I)^m$ for the $m$ in your question.
The latter holds because (1) we always have $\ker(T-\gamma I)^k\subset \ker(T-\gamma I)^{k+1}$, and (2) for compact operators the chain stabilizes: there is a maximal $m$ for which the inclusion is strict ($\subsetneq$), and beyond it all the kernels coincide.
Vectors in these kernels are called generalized eigenvectors (of order $k$, where $k$ is the smallest exponent with $(T-\gamma I)^k v = 0$).
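In finite dimensions every operator is compact, so the stabilization of the kernel chain is easy to see concretely; here is a small NumPy sketch (my own illustration) using a single Jordan block:

```python
import numpy as np

# A single 3x3 Jordan block for gamma = 2: geometric multiplicity 1,
# algebraic multiplicity 3.
gamma = 2.0
J = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 2.0]])
N = J - gamma * np.eye(3)

# dim ker(J - gamma I)^k grows until k = m = 3, then stabilizes.
for k in range(1, 5):
    dim_ker = 3 - np.linalg.matrix_rank(np.linalg.matrix_power(N, k))
    print(k, dim_ker)  # prints: 1 1, 2 2, 3 3, 4 3
```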
References
- This and the fact on compact operators are stated on p. 24, section 1.4.2 of *Finite Element Methods for Eigenvalue Problems*, by Jiguang Sun and Aihui Zhou, CRC Press, 2016, 343 pp. (By p. 18, $X$ is a Hilbert space.) Link to page 24: https://books.google.fi/books?id=YC7FDAAAQBAJ&pg=PA24
- *Theory and Applications of Volterra Operators in Hilbert Space*, by Israel Gohberg and M. G. Krein; p. 49, footnote 30 states the same for $\gamma\ne0$: https://books.google.fi/books?id=HUkR9eQhKLYC&pg=PA49
Note: $(T-\gamma I)^m = T^m - \binom{m}{1}\gamma T^{m-1} + \cdots + (-\gamma)^m I = T' + (-\gamma)^m I$, where $T'$, the sum of all the terms containing a factor of $T$, is compact. Therefore, the algebraic multiplicity of a nonzero eigenvalue of a compact operator is always finite (use Rudin F.A., Theorem 4.25a). (That of $0$ may be infinite; take $T=0$ on an infinite-dimensional space.)
BTW, "analytic multiplicity" may be different from the algebraic and geometric ones. "Geometric multiplicity" is $\dim \ker(T-\gamma I)$, hence at most the algebraic one.
Edit: much the same is said here, including a proof for the existence of a maximal $m$ (still assuming a Hilbert space; I haven't checked if it is necessary): https://math.stackexchange.com/a/406371
Best Answer
There is a geometric interpretation that I find helpful.
Consider for example the matrix $\begin{pmatrix} 1 & 0 \\ 5 & 1 \end{pmatrix}$, and think of it as a linear transformation on the plane $\mathbb{R}^2$, namely a linear function $T :\mathbb{R}^2 \to \mathbb{R}^2$ given by the formula $$T(x,y) = (x,y) \begin{pmatrix} 1 & 0 \\ 5 & 1 \end{pmatrix} = (x+5y,y) $$ The eigenvalue $\lambda=1$ has algebraic multiplicity 2, but there is only 1 linearly independent eigenvector, namely $(1,0)$ (or anything parallel to it). You can visualize $T$ as sliding each horizontal line $y = c$ by $5c$ in the $x$-direction, so the $x$-axis is the only line through the origin that $T$ maps to itself.
This kind of behavior is sometimes described by saying that $T$ is a "shearing transformation" or a "shear mapping".
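For a concrete check of the multiplicities in this example (my own addition; note that the row-vector convention $T(v) = vA$ used above makes the eigenvectors of $T$ the *left* eigenvectors of the matrix, i.e. the ordinary eigenvectors of its transpose):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [5.0, 1.0]])

# Left eigenvectors of A = ordinary eigenvectors of A^T.
eigvals = np.linalg.eigvals(A.T)
print(eigvals)  # [1., 1.]: lambda = 1 with algebraic multiplicity 2

# dim ker(A^T - I) = 2 - rank(A^T - I) = 1: a single independent
# eigenvector, spanned by (1, 0).
print(2 - np.linalg.matrix_rank(A.T - np.eye(2)))  # 1
```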
Shearing maps can be used to give a general description of the behavior when the algebraic multiplicity of an eigenvalue is higher than its geometric multiplicity: the matrix can be decomposed as the product of a "pure shearing transformation" (like the example I just gave) and a diagonalizable matrix.
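Here is a minimal sketch of that factorization (my own example; this multiplicative decomposition needs the eigenvalue to be nonzero, and here the diagonalizable factor is scalar, so the order of the two factors doesn't matter):

```python
import numpy as np

# A Jordan block with nonzero eigenvalue 2 factors as a diagonalizable
# (here even diagonal) matrix times a pure shear (a unipotent matrix).
J = np.array([[2.0, 1.0],
              [0.0, 2.0]])
D = np.diag([2.0, 2.0])       # diagonalizable factor
S = np.array([[1.0, 0.5],     # pure shear: identity plus nilpotent
              [0.0, 1.0]])
print(np.allclose(D @ S, J))  # True: J = D S
```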