For (1), see the citation in my answer to a previous question. In particular, yes, the set of all traceless matrices is precisely the set of all commutators, regardless of the underlying field.
The exercise in Hoffman and Kunze asks whether the subspace of all traceless matrices is equal to the subspace spanned by all commutators. This is different from asking whether the subspace of all traceless matrices is equal to the set of all commutators. Put another way, the exercise in Hoffman and Kunze sidesteps the question of whether the commutators themselves form a subspace.
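The easy inclusion, that every commutator $XY-YX$ is traceless, follows from $\text{tr}(XY)=\text{tr}(YX)$. As a minimal sketch (the helper names `matmul`, `commutator` and `trace` and the sample matrices are my own illustrative choices, not taken from Hoffman and Kunze), one can check this on small integer matrices and also exhibit the traceless matrix $E_{11}-E_{22}$ as the commutator of the matrix units $E_{12}$ and $E_{21}$:

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(X, Y):
    """Return the commutator XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

# Arbitrary sample matrices (illustrative only).
A = [[1, 2], [3, 4]]
B = [[0, 5], [-1, 7]]
C = commutator(A, B)
print(C, trace(C))  # the trace is 0 because tr(AB) = tr(BA)

# Matrix units E12 and E21: their commutator is E11 - E22.
E12 = [[0, 1], [0, 0]]
E21 = [[0, 0], [1, 0]]
D = commutator(E12, E21)
print(D)
```

This only illustrates the direction "commutator implies traceless"; the converse, that every traceless matrix is a single commutator, is the nontrivial statement referred to above.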
For (2), see my aforementioned answer again.
For (3) and (4), consider $I_2$ over $\mathbb{F}_2$.
This answer assumes the matrices are taken over $\mathbb C$.
Yes, the statement is still true even if the matrix isn't diagonalizable.
For the proof you saw, it is enough that $D$ can be taken to be an upper triangular matrix, which is always possible by Schur's Decomposition Theorem. This suffices because the diagonal entries of that triangular matrix are the eigenvalues of the starting matrix.
The Jordan Canonical Form would also suffice, but Schur's Decomposition is a weaker result and is all the argument needs.
For completeness I'll add the proofs here.
Let $n\in \mathbb N$ and $A\in \mathcal M_n(\mathbb C)$. Let $\lambda _1, \ldots ,\lambda _n$ be the eigenvalues of $A$. The characteristic polynomial $p_A(z)$ of $A$ is $\color{grey}{p_A(z)=}(z-\lambda _1)\ldots (z-\lambda _n)$.
Schur's Decomposition guarantees the existence of an invertible matrix $P$ and an upper triangular matrix $U$ such that $A=PUP^{-1}$ and $U$'s diagonal entries are exactly $\lambda _1, \ldots ,\lambda _n$.
Since similarity preserves the characteristic polynomial, it follows that the characteristic polynomial $p_U(z)$ of $U$ is $\color{grey}{p_U(z)=}(z-\lambda _1)\ldots (z-\lambda _n)$, therefore $U$ and $A$ have the same eigenvalues with the same algebraic multiplicity.
Since $U$ is upper triangular with diagonal entries $\lambda _1, \ldots ,\lambda _n$, its trace is the sum of these entries and its determinant is their product. Hence the trace of $U$ is the sum of the eigenvalues of $A$ and the determinant of $U$ is the product of the eigenvalues of $A$.
Trace properties yield the following $$\text{tr}(A)=\text{tr}\left(PUP^{-1}\right)=\text{tr}\left(UP^{-1}P\right)=\text{tr}(U),$$ thus proving that the sum of the eigenvalues of $A$ equals $\text{tr}(A)$.
Similarly for the determinant it holds that $$\det(A)=\det\left(PUP^{-1}\right)=\det\left(P\right)\det\left(U\right)\det\left(P^{-1}\right)=\det(U),$$
hence the product of the eigenvalues of $A$ equals the determinant of $A$.
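To see the argument at work on a matrix that is not diagonalizable, here is a small sketch. The matrices `U`, `P` and `P_inv` are illustrative choices of mine: `U` is upper triangular with the eigenvalue $3$ repeated and a nonzero entry above the diagonal, so it is not diagonalizable, and `P` is merely an invertible integer matrix with determinant $1$ (not the unitary matrix Schur's theorem would produce), chosen so that all arithmetic stays exact.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

def det2(X):
    """Determinant of a 2x2 matrix."""
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

# Upper triangular, eigenvalue 3 with algebraic multiplicity 2,
# not diagonalizable (illustrative choice).
U = [[3, 1], [0, 3]]

# An invertible integer matrix with det(P) = 1, so its inverse is also integral.
P = [[2, 1], [1, 1]]
P_inv = [[1, -1], [-1, 2]]

# A = P U P^{-1} is similar to U but no longer triangular.
A = matmul(matmul(P, U), P_inv)

print(A)
print(trace(A), trace(U))  # both equal 3 + 3
print(det2(A), det2(U))    # both equal 3 * 3
```

The eigenvalues are not visible on the diagonal of `A`, yet its trace and determinant agree with those of `U`, as the two displayed identities predict.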
Best Answer
Yes. Just look at the characteristic polynomial, say of degree $n$. The trace is minus the coefficient of $x^{n-1}$, and minus that coefficient is also the sum of the roots of the characteristic polynomial (in any monic polynomial of degree $n$, the coefficient of $x^{n-1}$ is the sum of its roots with a minus sign). Hence the trace equals the sum of the eigenvalues, counted with multiplicity.
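The claim about monic polynomials can be checked by expanding $(x-r_1)\cdots(x-r_n)$ directly. The following is a minimal sketch; the function name `poly_from_roots` and the sample roots are hypothetical choices for illustration only.

```python
def poly_from_roots(roots):
    """Expand (x - r_1)...(x - r_n); coefficients are listed highest degree first."""
    coeffs = [1]
    for r in roots:
        shifted = coeffs + [0]                  # current polynomial times x
        scaled = [0] + [r * c for c in coeffs]  # current polynomial times r
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    return coeffs

roots = [2, 3, 5]  # sample roots (illustrative)
coeffs = poly_from_roots(roots)
print(coeffs)  # coefficients of x^3, x^2, x, 1

# The coefficient of x^(n-1) is minus the sum of the roots.
print(coeffs[1], -sum(roots))
```

Applied to the characteristic polynomial, whose roots are the eigenvalues, this is exactly the statement that the trace is the sum of the eigenvalues.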