The Cayley-Hamilton theorem states that every matrix $A$ satisfies its own characteristic polynomial; that is the polynomial for which the roots are the eigenvalues of the matrix:
$p(\lambda)=\det[A-\lambda\mathbb{I}]$.
If you view the polynomial:
$a^2-5a+6=0$,
as a characteristic polynomial with roots $a=2,3$, then any matrix with eigenvalues that are any combination of 2 or 3 will satisfy the matrix polynomial:
$A^2-5A+6\mathbb{I}=0$,
that is any matrix similar to:
$\begin{pmatrix}3 & 0\\ 0 & 3\end{pmatrix}$,$\begin{pmatrix}2 & 0\\ 0 & 2\end{pmatrix}$,$\begin{pmatrix}2 & 0\\ 0 & 3\end{pmatrix}$. Note:$\begin{pmatrix}3 & 0\\ 0 & 2\end{pmatrix}$ is similar to $\begin{pmatrix}2 & 0\\ 0 & 3\end{pmatrix}$.
To see why this is true, imagine $A$ is diagonalized by some matrix $S$ to give a diagonal matrix $D$ containing the eigenvalues $D_{i,i}=e_i$, $i=1..n$, that is:
$A=SDS^{-1}$, $SS^{-1}=\mathbb{I}$.
This implies:
$A^2-5A+6\mathbb{I}=0$,
$SDS^{-1}SDS^{-1}-5SDS^{-1}+6\mathbb{I}=0$,
$S^{-1}\left(SD^2S^{-1}-5SDS^{-1}+6\mathbb{I}\right)S=0$,
$D^2-5D+6\mathbb{I}=0$,
and because $D$ is diagonal, for this to hold each diagonal entry of $D$ must satisfy this polynomial:
$D_{i,i}^2-5D_{i,i}+6=0$,
but the diagonal entries are the eigenvalues of $A$ and thus it follows that the polynomial is satisfied by $A$ iff the polynomial is satisfied by the eigenvalues of $A$.
Saying that there are "four independent variables" does not constitute a subspace. For example, the subset of all vectors in $\Bbb{R}^2$ such that $a_1+a_2+1=0$ is not a subspace despite there being an independent variable. Instead, you should use the subspace test.
Let $A, B \in V$ and $c \in \Bbb{R}$. We must show that:
$$A+cB \in V$$
This can be otherwise stated as:
$$(A+Bv)\pmatrix{1\\2\\3}=\pmatrix{0\\0}$$
Distribute on the left-hand side to get:
$$(A+Bv)\pmatrix{1\\2\\3}=A\pmatrix{1\\2\\3}+cB\pmatrix{1\\2\\3}$$
Now, we are given $A, B \in V$, so obviously, those matrices times $\pmatrix{1\\2\\3}$ are $\pmatrix{0\\0}$:
$$(A+Bv)\pmatrix{1\\2\\3}=\pmatrix{0\\0}+c\pmatrix{0\\0}=\pmatrix{0\\0}$$
This proves the equation we wanted, so $V$ is a subspace.
Now, to find the dimension, we set up what you did the first time:
$$\text{Let }A=\pmatrix{a_1 & a_2 & a_3 \\ a_4 & a_5 & a_6}$$
$$\pmatrix{a_1 & a_2 & a_3 \\ a_4 & a_5 & a_6}\pmatrix{1\\2\\3}=\pmatrix{0\\0}$$
$$a_1+2a_2+3a_3=0 \\ a_4+2a_5+3a_6=0$$
If we express $A$ as a column vector with $a_i$s rather than a matrix, the above corresponds to the following matrix-vector product equation:
$$\pmatrix{1 & 2 & 3 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 2 & 3}\pmatrix{a_1\\a_2\\a_3\\a_4\\a_5\\a_6}=\pmatrix{0\\0}$$
Now, basically, we are finding the dimension of the null space of the matrix on the left. There are two pivot columns and six columns in all, giving us a dimension of $4$.
Best Answer
Following the hint of @P.Quinton, we can rewrite the problem as $(A-B)^2=B^2$, and that reduces the problem to finding all $C$ such that $C^2=B^2$. The difficulty here is that $B$ has a repeated eigenvalue, which means that finding all the square roots of $B^2$ is not quite straight forward. After all, any reflection will square to the identity, and so on that eigenspace we can take something that restricts to a reflection.
We first note that $B$ is diagonalizable, with $B=PDP^{-1}$, with $D=\operatorname{diag}(1,1,6)$.
Because of this, $B^2$ will satisfy the polynomial $f(x)=(x-1)(x-6^2)$, and hence so will $C$. But then $C$ will satisfy the polynomial $f(x^2)=(x^2-1)(x^2-6^2)=(x-1)(x+1)(x-6)(x+6)$ which has no repeated roots, so $C$ is also diagonalizable. If $v$ is an eigenvector of $C$, then it is also an eigenvector of $C^2=B^2$, The problem is that when we square $C$, things that weren't eigenvectors for $C$ can become eigenvectors for $C^2$, for example if $Cu=u$ and $Cv=-v$, then $C^2(u+v)=u+v$, but $C(u+v)=u-v\neq \lambda(u+v)$. But if we pick any two vectors that span $\ker(C^2-I)$, then we can pick them to be eigenvectors in $C$. If they both have eigenvalue $1$ or both have eigenvalue $-1$, then the matrix we get for $C$ will not depend on our choice because $C$ restricted to their span will be $\pm I$. If they have different signs, then the choices matter.
For the sake of computation, it is easier to find square roots of $D^2$ and then conjugate to turn those into square roots of $B^2$. If we follow the proceedure above, then we will get block diagonal matrices, $\operatorname{diag}(M,\pm 6)$ where the first block is $2\times 2$ and the last block is $1\times 1$. The first block, $M$ is a matrix satisfying $M^2=I_2$ (the $2\times 2$ identity matrix), which is either $\pm I$ or a matrix with eigenvalues $1$ and $-1$, which can be found easily by specifying a trace of $0$ and a determinant of $-1$. In particular, if you set the first column of $M$, those two conditions will uniquely determine the second column.
Putting everything together, $C$ is of the form $PNP^{-1}$ where $P$ was the matrix used in the diagonalization of $B$ and
$$N=\begin{pmatrix} a & b & 0 \\ \frac{1-a^2}{b} & -a & 0 \\ 0 & 0 & \pm 6 \end{pmatrix}$$
(except when $b=0$, in which case $a=\pm 1$ and the $(2,1)$ entry can be anything), or where $N$ is one of the other forms already discussed.