Let $A$ be an $n\times n$ matrix, for some $n\in \Bbb N$, over some algebraically closed field. The following holds:
$$A \text{ is nilpotent }\iff A\text {'s only eigenvalue is }0.$$
Question 1: Does $A$ being nilpotent imply its diagonal entries are all $0$?
According to the above characterization of nilpotency, absolutely not: nilpotency constrains the eigenvalues (and hence the trace, which must be $0$), not the individual diagonal entries.
Take for instance the matrix $\begin{bmatrix} -3 & -1\\ 9 & 3\end{bmatrix}$. It's easy to check that $\begin{bmatrix} -3 & -1\\ 9 & 3\end{bmatrix}^2=0_{2\times 2}$.
Question 2: Is a matrix whose diagonal entries are all $0$ necessarily nilpotent?
Again, no. There are matrices satisfying these conditions which don't even have $0$ as an eigenvalue. For instance $\begin{bmatrix} 0 & 1\\ 1 & 0\end{bmatrix}$, whose eigenvalues are $\pm 1$.
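If you want to verify these two counterexamples by machine, here's a quick sketch using Python's sympy library (an illustration, not part of the original argument):

```python
from sympy import Matrix

# Question 1: nilpotent, yet the diagonal entries are nonzero.
A = Matrix([[-3, -1],
            [ 9,  3]])
print(A**2)           # Matrix([[0, 0], [0, 0]]) -- so A is nilpotent
print(A.eigenvals())  # {0: 2} -- its only eigenvalue is 0

# Question 2: zero diagonal, yet not nilpotent.
B = Matrix([[0, 1],
            [1, 0]])
print(B.eigenvals())  # {-1: 1, 1: 1} -- 0 is not even an eigenvalue
```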
More can be said, though whether it is worth saying depends on your background. As far as I know, it all comes down to a matrix's Jordan Normal Form:
Any nilpotent matrix $n\times n$ is similar to some block diagonal matrix $$ {\begin{bmatrix}
\color{blue}{J_1} & 0 &\dots &\dots & 0\\
0 & \color{blue}{J_2} & 0 & \dots &0\\
\vdots & \ddots & \ddots & \ddots &\vdots\\
\vdots & \ddots & \ddots &\ddots & 0\\
0 & \dots & \dots & 0 & \color{blue}{ J_k}\\
\end{bmatrix}}_{n\times n},$$ for some $k\in \Bbb N$, where, for each $i\in \{1,\ldots ,k\}$, $J_i$ is an $m_i\times m_i$ matrix, for some $m_i\in \Bbb N$, that looks like
$$\begin{bmatrix}0& 1 &&& \\
& 0 & 1 &\huge 0& \\
& & \ddots & \ddots &\\
&\huge 0 && 0 &1 \\
&&& & 0 \\
\end{bmatrix}_{m_i\times m_i}.$$
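If you'd like to see this in action, sympy can compute the Jordan Normal Form directly. Here's a small sketch using the nilpotent matrix from Question 1 (sympy's `jordan_form` returns $P$ and $J$ with $A = PJP^{-1}$):

```python
from sympy import Matrix

A = Matrix([[-3, -1],
            [ 9,  3]])
P, J = A.jordan_form()       # decomposition A = P * J * P**(-1)
print(J)                     # Matrix([[0, 1], [0, 0]]) -- a single 2x2
                             # nilpotent Jordan block, as described above
print(P * J * P.inv() == A)  # True
```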
You can literally read a basis for the nullspace of a matrix from its rref form. I describe the procedure in some detail here.
As this process consists of solving a few linear equations, it is easily automated: augment the transpose of the rref matrix with the appropriately-sized identity and row-reduce again, as you might do to compute the inverse of a matrix. The kernel basis will appear as if by magic on the augmented side of the zero rows of the resulting matrix. Taking the two larger examples from the linked answer, $$\begin{align}
\pmatrix{1&0&2&-3\\0&1&-1&2\\0&0&0&0} &\to \left(\begin{array}{ccc|cccc}1&0&0&1&0&0&0\\0&1&0&0&1&0&0\\2&-1&0&0&0&1&0\\-3&2&0&0&0&0&1\end{array}\right) \\
&\to \left(\begin{array}{ccc|cccc}1&0&0&1&0&0&0\\0&1&0&0&1&0&0\\0&0&0&-2&1&1&0\\0&0&0&3&-2&0&1\end{array}\right)
\end{align}$$ and $$\begin{align}
\pmatrix{1&2&0&2\\0&0&1&-1\\0&0&0&0\\0&0&0&0} &\to \left(\begin{array}{cccc|cccc}1&0&0&0&1&0&0&0\\2&0&0&0&0&1&0&0\\0&1&0&0&0&0&1&0\\2&-1&0&0&0&0&0&1\end{array}\right) \\
&\to \left(\begin{array}{cccc|cccc}1&0&0&0&1&0&0&0\\0&1&0&0&0&0&1&0\\0&0&0&0&-2&1&0&0\\0&0&0&0&-2&0&1&1\end{array}\right).
\end{align}$$
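Here's a minimal sketch of how one might automate this in Python with sympy. One caveat: sympy's `rref` fully reduces the augmented columns as well, so the kernel basis it returns can differ from the hand computations above by invertible row operations, while still spanning the same kernel.

```python
from sympy import Matrix, eye

def kernel_basis_from_rref(R):
    """Kernel basis of R, read off the augmented side of the zero rows."""
    m, n = R.shape
    red, _ = R.T.row_join(eye(n)).rref()    # row-reduce [R^T | I_n]
    return [red[i, m:].T for i in range(n)  # take the augmented side of
            if red[i, :m].is_zero_matrix]   # the rows whose left block is 0

R = Matrix([[1, 0,  2, -3],
            [0, 1, -1,  2],
            [0, 0,  0,  0]])
for v in kernel_basis_from_rref(R):
    assert (R * v).is_zero_matrix           # each vector is in the kernel
    print(v.T)  # a kernel basis, equivalent to {(-2,1,1,0), (3,-2,0,1)}
```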
In fact, if you apply this process to the transpose of the original matrix, you get everything at once: the non-zero rows of the rref side are a basis for the image, while the augmented sides of the zero rows are a basis for the kernel. This doesn’t give the nicest form of the kernel basis, however. The vectors you get will be (sometimes large) scalar multiples of the vectors that you would have gotten by computing the kernel separately. Here are a couple of examples: $$
M=\pmatrix{0&4&-4&8\\2&4&0&2\\3&0&6&9} \\
\left(\begin{array}{ccc|cccc}0&2&3&1&0&0&0\\
4&4&0&0&1&0&0\\
-4&0&6&0&0&1&0\\
8&2&9&0&0&0&1\end{array}\right) \to
\left(\begin{array}{ccc|cccc}1&0&0&\frac14&\frac1{12}&0&\frac1{12}\\
0&1&0&\frac14&\frac16&0&-\frac1{12}\\
0&0&1&\frac16&-\frac19&0&\frac1{18}\\
0&0&0&-288&144&144&0\end{array}\right).
$$ Computing the kernel separately yields $(-2,1,1,0)^T$.
$$
A=\pmatrix{2&4&2&2\\1&3&2&0\\3&1&-2&8} \\
\left(\begin{array}{ccc|cccc}2&1&3&1&0&0&0\\
4&3&1&0&1&0&0\\
2&2&-2&0&0&1&0\\
2&0&8&0&0&0&1\end{array}\right) \to
\left(\begin{array}{ccc|cccc}1&0&4&\frac32&-\frac12&0&0\\
0&1&-5&-2&1&0&0\\
0&0&0&2&-2&2&0\\
0&0&0&-6&2&0&2\end{array}\right).
$$ The separate computation yields $(1,-1,1,0)^T$ and $(-3,1,0,1)^T$ for the kernel. I’ll leave it to you to verify that the rref side of these two examples does indeed hold a basis for the image.
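The all-at-once variant is just as easy to automate. Here's a sketch along the same lines (again assuming sympy; as noted above, the kernel vectors may come out scaled or combined differently than in the hand-worked examples):

```python
from sympy import Matrix, eye

def image_and_kernel(M):
    """Bases for the image and kernel of M, from one row reduction."""
    m, n = M.shape
    red, _ = M.T.row_join(eye(n)).rref()    # row-reduce [M^T | I_n]
    image  = [red[i, :m].T for i in range(n)
              if not red[i, :m].is_zero_matrix]
    kernel = [red[i, m:].T for i in range(n)
              if red[i, :m].is_zero_matrix]
    return image, kernel

M = Matrix([[0, 4, -4, 8],
            [2, 4,  0, 2],
            [3, 0,  6, 9]])
image, kernel = image_and_kernel(M)
print([v.T for v in image])   # basis for the column space of M
print([v.T for v in kernel])  # spans the kernel; cf. (-2, 1, 1, 0)^T
```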
Best Answer
If I am understanding your question correctly, I believe you are slightly misunderstanding what the Span is.
Essentially, the span of a set $S$ (where elements of $S$ are vectors) is the set of all linear combinations of said vectors. In other words the term "span of a matrix" is not well-defined, but you can take the span of the columns of a matrix (or its rows).
Before we continue, let us define the Null Space of a matrix $A$. The elements of the Null Space are the vectors $\vec{x}$ such that: $$A\vec{x}=\vec{0}$$ It turns out that the Null Space is itself a subspace, so it too can be written as the span of some set of vectors, found by solving the system above (in general these are not columns of $A$).
If the columns of $A$ turn out to be linearly independent, then there is no such nonzero vector: the Null Space contains only the $\vec{0}$ vector, and as such the dimension of the Null Space of $A$ is $0$.
When you say "a 2-dimensional null-space vector take away 2-dimensional span of a matrix", I believe you are referring to an intuition for what we call the Rank-Nullity Theorem, which says that if $T:V\rightarrow W$ is a linear transformation (a matrix), then: $$\dim(\operatorname{Range}(T))+\dim(\operatorname{Null}(T)) = \dim(V)$$
The range of $T$ is essentially the span of all the columns of $T$, and so if the dimension of the Null Space is non-zero, this essentially means that the number of linearly independent vectors needed to span $\operatorname{Range}(T)$ is $\dim(V)-\dim(\operatorname{Null}(T))$.
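As a quick sanity check of the theorem, you can verify rank plus nullity against the number of columns with sympy (an illustrative sketch on an arbitrary example matrix):

```python
from sympy import Matrix

T = Matrix([[1, 0,  2, -3],
            [0, 1, -1,  2],
            [0, 0,  0,  0]])
rank    = T.rank()               # dim(Range(T)) = 2
nullity = len(T.nullspace())     # dim(Null(T))  = 2
assert rank + nullity == T.cols  # dim(V) = number of columns = 4
```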