Can I simplify this to pick out of the summation only the mode that matches the transition energy $\hbar \omega_{21}$? This would be because photons are emitted/absorbed entirely anyway, so the energies have to match, and the photons which do not have the right energy simply won't interact with the atom
That's exactly right.
Let's see it in detail.
Slightly nicer notation
$\newcommand{\ket}[1]{\left \lvert #1 \right\rangle}$
First let's make the notation slightly more standard.
We have $\hat{\sigma} \equiv |2\rangle\langle1|$ so that in the $|2\rangle, |2\rangle$ basis, $\hat{\sigma}^\dagger \hat{\sigma}$ has matrix representation
$$\left( \begin{array}{cc}
0 & 0 \\ 0 & 1
\end{array} \right) \, .$$
We're always free to put in an energy offset so we can shift to
$$\left( \begin{array}{cc}
-1/2 & 0 \\ 0 & 1/2
\end{array} \right) \, .$$
which is conveniently written as $- \sigma_z / 2$.
Therefore, our system's Hamiltonian is$^{[a]}$
\begin{align}
H / \hbar
=& H_A/\hbar + H_R/\hbar + H_I/\hbar \\
=& -\frac{\omega_{21}}{2} \sigma_z +
\sum_k \omega_k (a_k^\dagger a_k + 1/2) +
\sum_k g_k (a^\dagger_k + a_k) \sigma_x \, .
\end{align}
Interaction picture
This Hamiltonian is kind of complicated because there are so many interaction terms.
We can enormously simplify it by using the so-called "interaction picture".
The idea is to "unwind" the evolution of the atom and fields caused by their own Hamiltonians.
In the interaction picture formalism, you get a new effective Hamiltonian and the operators pick up a time dependence.
You construct the propagators of the atom and fields without the coupling:
$$U_A(t) \equiv \exp \left( -i H_A t / \hbar \right) = \exp \left(i \omega_{21} \sigma_z t / 2 \right)$$
and
$$U_R = \prod_k \exp \left(-i \omega_k t (a_k + a_k^\dagger + 1/2) \right) \, .$$
Then you find the time dependent operators.
For each operator $T$, it's time dependent version is
$$V(t) = U^\dagger(t) T U(t)$$
where here $U(t)$ means $U_A(t) \otimes U_R(t)$.
The interaction picture Hamiltonian is only the interaction part $H_I$, but with the time dependent versions of the operators.$^{[b]}$
So, all we have to do is figure out what $\sigma_x$ and $a$ look like with this time dependence.
You can do the algebra and you'll find that for the atom
$$
U_A(t)^\dagger \sigma_x U_A(t) = \cos(\omega_{21}t)\sigma_x + \sin(\omega_{21}t) \sigma_y = e^{i \omega_{21} t}\sigma_+ + e^{-i \omega_{21} t}\sigma_-
$$
where $\sigma_+ \equiv \left \lvert 2 \right\rangle \left \langle 1 \right \rvert$ and $\sigma_- = \sigma_+^\dagger$.
You can also find that in the interaction picture $a_k \rightarrow a e^{-i \omega_k t}a$.
Putting it all together, in the interaction picture our Hamiltonian is
\begin{align}
H'/\hbar
=&
\sum_k g_k
(a_k^\dagger e^{i \omega_k t} + a_k e^{-i \omega_k t})
(\sigma_+ e^{i \omega_{21} t} + \sigma_- e^{-i \omega_{21}t}) \\
=& \sum_k g_k \left(
a_k^\dagger \sigma_+ e^{i (\omega_k + \omega_{21})t} +
a_k^\dagger \sigma_- e^{i (\omega_k - \omega_{21})t} +
a_k \sigma_+ e^{i (-\omega_k + \omega_{21})t} +
a_k \sigma_- e^{i (-\omega_k - \omega_{21})t}
\right) \, .
\end{align}
Rotating wave approximation
Of these four terms, two of them have fast oscillating time dependencies because the frequencies in the exponent are either both negative or both positive.
In a lot of cases these fast oscillating time terms contribute much less to the dynamics than the slower terms.
This is roughly because the integral of an oscillating function stays small, while the integral of a constant function gets large.
Anyway, you can often drop those terms, leaving
\begin{align}
H'/\hbar
=& \sum_k g_k \left(
a_k^\dagger \sigma_- e^{i (\omega_k - \omega_{21})t} +
a_k \sigma_+ e^{-i (\omega_k - \omega_{21})t}
\right) \, .
\end{align}
Doing this is called the "rotating wave approximation.
Now we've got a sum over $k$ where each term has a frequency $\omega_k - \omega_{21}$.
Again, if the modes are spaced out, then you can have a situation where most of these frequencies are large, but there could be one radiation mode $l$ resonant with the atom, i.e. $\omega_l \approx \omega_{21}$.
Then, that term dominates and we get
$$H'/\hbar \approx g_l \left(
a_l^\dagger \sigma_- e^{i (\omega_l - \omega_{21})t} +
a_l \sigma_+ e^{-i (\omega_l - \omega_{21})t}
\right) \, .$$
If the frequencies are exactly the same (i.e. the atom and radiation mode $l$ are exactly on resonance), then you get
$$H'/\hbar \approx g_l \left(
a_l^\dagger \sigma_- + a_l \sigma_+ \right) \, .$$
In the subspace of radiation Fock states $\ket{n}$ and $\ket{n+1}$, this Hamiltonian is just proportional to $\sigma_x$, i.e. the quantum of energy oscillates between the atom and the radiation field.
Review
Your intuition about the atom only interacting with radiation modes of the same frequency is correct.
The "rotating wave approximation" is the name given to the approximation where you drop all the radiation modes that aren't resonant with the atom (of course, this works for things other than atoms and radiation).
The terms we dropped are ones that do not conserve energy.
It turns out they actually do have a role in real systems and some times they're very important, but in a lot of cases you can ignore them and use the super simple Hamiltonian we found.
$[a]$: Note that by working with $H/\hbar$, which has dimensions of frequency, all the $\hbar$'s on the right hand side are gone.
This is really convenient and it's what people are really doing when they say that they're "setting $\hbar$ to 1".
$[b]$: See this other Physics SE post for a demonstration of why.
As you have probably already worked out, the two eigenspaces that you're interested in, $N=2$ and $N=3$, are formed by a direct sum of subspaces of different angular momentum; thus, the $N=2$ eigenspace has one $l=2$ and one $l=0$ subspace, of dimensions $5+1=6$, and the $N=3$ eigenspace has one $l=3$ and one $l=1$ subspace, of dimensions $7+3=10$. You therefore have two distinct tasks: moving around within each subspace, and jumping to a different representation.
The first task is relatively easy, and in fact you do not need much knowledge of the internal structure of the hamiltonian to do this. You already know that $H$ commutes with $\mathbf L$, which guarantees you the shared eigenbasis $|nlm⟩$, but more than that you know that acting on the $|nlm⟩$ with components in the angular momentum, and particularly the ladder operators $L_\pm = L_x+iL_y$, will keep you inside that subspace. In this vein, for example, you can take the $|3,3,3⟩$ state you've already found to get
$$L_-|3,3,3⟩\propto |3,3,2⟩,$$
and so on.
What you really need to implement this in your language, though, is to bring the $L_\pm$ into the language of your creation and annihilation operators. We've already done most of the heavy lifting in the previous question, which gave us the identity
$$
L_i=-i\hbar\varepsilon_{ijk}a_j^\dagger a_k^\phantom{\dagger},
$$
and now we just need to apply this to the ladder operators:
\begin{align}
L_\pm
& = L_x \pm i L_y
\\ & =
-i\hbar\varepsilon_{1jk}a_j^\dagger a_k^\phantom{\dagger} \pm \hbar\varepsilon_{2jk}a_j^\dagger a_k^\phantom{\dagger}
\\ & =
-i\hbar(a_2^\dagger a_3^\phantom{\dagger} -a_3^\dagger a_2^\phantom{\dagger} )\pm \hbar(a_3^\dagger a_1^\phantom{\dagger}-a_1^\dagger a_3^\phantom{\dagger})
\\ & = \hbar(\mp a_1^\dagger-ia_2^\dagger)a_3^\phantom{\dagger} + \hbar a_3^\dagger(\pm a_1^\phantom{\dagger}+ia_2^\phantom{\dagger})
\\ & = \sqrt{2}\hbar(a_\pm^\dagger a_0^\phantom{\dagger} + a_0^\dagger a_\mp^\phantom{\dagger}),
\end{align}
where I've used the forms $a_\pm = \frac{1}{\sqrt{2}}(\mp a_1 +ia_2)$ and $a_\pm^\dagger =\frac{1}{\sqrt{2}}(\mp a_1^\dagger -ia_2^\dagger) $. This result for $L_\pm$ can be understood rather easily: you destroy one quantum with $a_0$ and then recreate it with $\pm$more angular momentum, or you destroy one quantum with $a_\mp$ and then recreate it with $\pm1$ more unit of angular momentum. Easy!
As applied to your $N=3$ state, $|3,3,3⟩ = \frac{1}{\sqrt{3!}}(a_+^\dagger)^3|0⟩$, you can then proceed to get
\begin{align}
|3,3,2⟩
&=
\frac{1}{\sqrt{6}}L_- |3,3,3⟩
\\ & =
\frac{1}{\sqrt{6}} \, \sqrt{2} \hbar(a_-^\dagger a_0^\phantom{\dagger} + a_0^\dagger a_+^\phantom{\dagger}) \ \frac{1}{\sqrt{3!}}(a_+^\dagger)^3|0⟩
\\ & =
\frac{1}{3\sqrt{2}}\hbar(0 + a_0^\dagger a_+^\phantom{\dagger}) \cdot (a_+^\dagger)^3|0⟩
\\ & =
\frac{1}{3\sqrt{2}}\hbar\, a_0^\dagger \left((a_+^\dagger)^3 a_+^\phantom{\dagger}+3(a_+^\dagger)^2\right) |0⟩
\\ & =
\frac{1}{\sqrt{2!}} \hbar \, a_0^\dagger (a_+^\dagger)^2 |0⟩,
\end{align}
and so on, until you reach $|3,3,-3⟩ = \frac{1}{\sqrt{3!}}(a_-^\dagger)^3|0⟩$.
The other subspace, with $l<N$, is a bit more tricky, because you cannot get it from the naive argument of just doing $(a_+^\dagger)^{N} |0⟩$ and then using the angular momentum algebra to cover you.
I always find the emergence of these shells to be easier to understand by looking at the way the cartesian components combine, so let me start by building the $N=2$, $l=0$ state directly on the cartesian basis $|n_x,n_y,n_z⟩_\mathrm{C}$. In particular, consider the state
$$
|\psi⟩ = \frac{1}{\sqrt{3}} \left( |2,0,0⟩_\mathrm{C} + |0,2,0⟩_\mathrm{C} + |0,0,2⟩_\mathrm{C}\right),
$$
which has a position-space representation
\begin{align}
⟨x,y,z|\psi⟩
& = \frac{1}{\sqrt{3}} \left( ⟨x,y,z|2,0,0⟩_\mathrm{C} + ⟨x,y,z|0,2,0⟩_\mathrm{C} + ⟨x,y,z|0,0,2⟩_\mathrm{C}\right)
\\ & = \frac{1}{\sqrt{3}} \bigg[ H_2(x)+H_2(y)+H_2(z)\bigg]e^{-r^2/2}
\\ & \propto \frac{1}{\sqrt{3}} \bigg[ (4x^2-2)+(4y^2-2)+(4z^2-2)\bigg]e^{-r^2/2}
\\ & = \frac{1}{\sqrt{3}} \left(4r^2-6\right)e^{-r^2/2}
,
\end{align}
so it depends only on $r$ and therefore belongs to the $l=0$ representation. This is all well and good, but how do we represent this using the bosonic operators? This can be read directly from the state above, by rephrasing it as
\begin{align}
|2,0,0⟩
& =
\frac{1}{\sqrt{3!}}\left((a_1^\dagger)^2+(a_2^\dagger)^2+(a_3^\dagger)^2\right)|0⟩
\\ & =
\frac{1}{\sqrt{3!}}\left(\frac12(-a_+^\dagger +a_-^\dagger)^2-\frac12(a_+^\dagger +a_-^\dagger)^2+(a_0^\dagger)^2\right)|0⟩
\\ & =
\frac{1}{\sqrt{3!}}\left((a_0^\dagger)^2-2a_+^\dagger a_-^\dagger\right)|0⟩
\end{align}
where I've used $a_1^\dagger = \frac{1}{\sqrt{2}}\left(-a_+^\dagger +a_-^\dagger\right)$ and $a_2^\dagger = \frac{i}{\sqrt{2}}\left(a_+^\dagger +a_-^\dagger\right)$. This makes a lot of sense: $|2,0,0⟩$ is generated up from the vacuum state through two pathways that increase $N$ by two but leave $L_0$ unchanged, $(a_0^\dagger)^2$ and $a_+^\dagger a_-^\dagger$, and then those two are superposed in a way that will make $L^2$ vanish. There is an obvious question, though - what is that operator, exactly, and how do we generalize it?
The answer here is that the above example is illustrative, but it is not quite the right approach, yet. We don't really want operators that will take us from one energy eigenspace to another, because those will not commute with the hamiltonian and will therefore have some relatively complicated algebraic properties. Instead, what we really want is a way to generate the $|2,0,0⟩$ state by jumping down the $l$ ladder from the $N=2$, $l=2$ space: if we do this, then our new class of ladder operators can commute with the hamiltonian (but not with $\mathbf L$).
This gives you a couple of clear candidates, because the isotropic oscillator has very few independent symmetries: the angular momentum, which we've already used, the Laplace-Runge-Lenz vector, and something called the Fradkin tensor, which is defined as
$$
F_{ij} = p_ip_j +x_ix_j
$$
in the cartesian frame, and which turns out to commute with the hamiltonian (which you should check explicitly). The Fradkin tensor tends to play nicer with the harmonic oscillator than the Runge-Lenz vector, and I've tried to make it mesh below but to be honest it's not quite there so you're going to need to fill in some blanks.
To bring this over to our old language, let's start by changing the quadratures over into bosonic operators, giving us
\begin{align}
F_{ij}
& = p_ip_j +x_ix_j
\\ & =
\frac{a_i^\phantom{\dagger}-a_i^\dagger}{\sqrt{2}\,i} \frac{a_j^\phantom{\dagger}-a_j^\dagger}{\sqrt{2}\, i} + \frac{a_i^\phantom{\dagger}+a_i^\dagger}{\sqrt{2}} \frac{a_j^\phantom{\dagger}+a_j^\dagger}{\sqrt{2}}
\\ & =
\frac12 \left[-(a_i^\phantom{\dagger}-a_i^\dagger)(a_j^\phantom{\dagger}-a_j^\dagger)+(a_i^\phantom{\dagger}+a_i^\dagger)(a_j^\phantom{\dagger}+a_j^\dagger)\right]
\\ & =
a_i^\phantom{\dagger}a_j^\dagger+a_i^\dagger a_j^\phantom{\dagger}
.
\end{align}
(Exercise: is this explicitly hermitian? If not, why is it hermitian?) This is, of course, on a cartesian basis for the tensor part, but of we want to be talking about spherical tensors here, so we really want to be using its spherical components, $F_{0,0}$ and $F_{2,m}$; here the scalar component is easy, since
$$
F_{0,0} = \sum_{i=1}^3 F_{ii} = 2H
$$
is just the hamiltonian, and to get the quadrupole components the easiest route is by analogy with the electric quadrupole $Q_{ij}=x_ix_j$, so you get
\begin{align}
F_{2,2}
&= F_{11}+2iF_{12}-F_{22}
\\ &= (a_1^\phantom{\dagger}a_1^\dagger+a_1^\dagger a_1^\phantom{\dagger}) +2i (a_1^\phantom{\dagger}a_2^\dagger+a_1^\dagger a_2^\phantom{\dagger}) - (a_2^\phantom{\dagger}a_2^\dagger+a_2^\dagger a_2^\phantom{\dagger})
\\ & = (a_1^\phantom{\dagger} + i a_2^\phantom{\dagger})(a_1^\dagger+ia_2^\dagger)
+(a_1^\dagger+ia_2^\dagger)(a_1^\phantom{\dagger} + i a_2^\phantom{\dagger})
\\ & = -4\,a_+^\dagger a_-^\phantom{\dagger},
\\[2mm]
F_{2,1}
&= F_{13}+iF_{23}
\\ & =
(a_1^\phantom{\dagger}a_3^\dagger+a_1^\dagger a_3^\phantom{\dagger}) +i(a_2^\phantom{\dagger}a_3^\dagger+a_2^\dagger a_3^\phantom{\dagger})
\\ & =
(a_1^\phantom{\dagger}+ia_2^\phantom{\dagger})a_3^\dagger+(a_1^\dagger +ia_2^\dagger )a_3^\phantom{\dagger}
\\ & =
\sqrt{2}\left(a_-^\phantom{\dagger}a_0^\dagger-a_+^\dagger a_0^\phantom{\dagger} \right),
\\[2mm]
F_{2,0}
& = 2F_{33}-F_{11}-F_{22}
\\ & =
2(a_3^\phantom{\dagger}a_3^\dagger+a_3^\dagger a_3^\phantom{\dagger}) -(a_1^\phantom{\dagger}a_1^\dagger+a_1^\dagger a_1^\phantom{\dagger}) -(a_2^\phantom{\dagger}a_2^\dagger+a_2^\dagger a_2^\phantom{\dagger})
\\ & =
2(a_0^\phantom{\dagger}a_0^\dagger+a_0^\dagger a_0^\phantom{\dagger}) -(a_+^\phantom{\dagger}a_+^\dagger+a_+^\dagger a_+^\phantom{\dagger}) - (a_-^\phantom{\dagger}a_-^\dagger+a_-^\dagger a_-^\phantom{\dagger})
\\ & =
2a_0^\dagger a_0^\phantom{\dagger} -a_+^\dagger a_+^\phantom{\dagger} - a_-^\dagger a_-^\phantom{\dagger}
,
\\[2mm] \text{and similarly}
\\[2mm]
F_{2,-1}
&= F_{13}-iF_{23}
\\ & =
(a_1^\phantom{\dagger}a_3^\dagger+a_1^\dagger a_3^\phantom{\dagger}) -i(a_2^\phantom{\dagger}a_3^\dagger+a_2^\dagger a_3^\phantom{\dagger})
\\ & =
(a_1^\phantom{\dagger}-ia_2^\phantom{\dagger})a_3^\dagger+(a_1^\dagger -ia_2^\dagger )a_3^\phantom{\dagger}
\\ & =
\sqrt{2}\left(a_-^\dagger a_0^\phantom{\dagger}-a_+^\phantom{\dagger}a_0^\dagger\right)
\quad\text{and}
\\[2mm]
F_{2,-2}
&= F_{11}-2iF_{12}-F_{22}
\\ &= (a_1^\phantom{\dagger}a_1^\dagger+a_1^\dagger a_1^\phantom{\dagger}) -2i (a_1^\phantom{\dagger}a_2^\dagger+a_1^\dagger a_2^\phantom{\dagger}) - (a_2^\phantom{\dagger}a_2^\dagger+a_2^\dagger a_2^\phantom{\dagger})
\\ & = (a_1^\phantom{\dagger} - i a_2^\phantom{\dagger})(a_1^\dagger-ia_2^\dagger)
+(a_1^\dagger-ia_2^\dagger)(a_1^\phantom{\dagger} - i a_2^\phantom{\dagger})
\\ & = -4\,a_-^\dagger a_+^\phantom{\dagger}.
\end{align}
These operators obviously commute with the hamiltonian, and while not hermitian they obey $F_{2,m}^\dagger = F_{2,-m}$.
This is as far as I'm going to take it for now, as I'm out of time. However, this should give you a good flavourful of where to take this to complete all the relationships in the relevant algebras; the correct ladder operators are there somewhere and it should just be a matter of whittling things down to get the correct relationships.
Best Answer
Previous answer completely rewritten
It seems to me that your hypothesis is true, up to a constant correction: $$\sum_{\textstyle{\pi: \{1,2,\ldots,n\} \to \{\mathbb{I}, \sigma_x, \sigma_y, \sigma_z\} \atop \forall i \in \{x,y,z\}:\ \mathrm{card}(\pi^{-1}(\sigma_i))=n_i}} \!\!\bigotimes_{i=1}^n\ \ \pi(i) \cong \frac1{n_x! n_y! n_z!} \mathopen{:} \left( a^\dagger b + b^\dagger a \right)^{n_x} \left( - i a^\dagger b + i b^\dagger a \right)^{n_y} \left( a^\dagger a - b^\dagger b\right)^{n_z} \mathclose{:},$$ and that this can be proven using multivariate induction.
For generic $n_x, n_y, n_z$, let $$[\![n_x, n_y, n_z]\!]$$ symbolically denote the operator on the left-hand side, in the qubit picture.
The induction step may be based upon the observation $$[\![1,0,0]\!][\![n_x, n_y, n_z]\!] = (n_x+1)[\![n_x+1, n_y, n_z]\!] - i(n_y+1)[\![n_x, n_y+1, n_z-1]\!] + i(n_z+1)[\![n_x, n_y-1, n_z+1]\!] + (n-n_x-n_y-n_z+1)[\![n_x-1, n_y, n_z]\!].$$ (It's a matter of simple combinatorics to find the multipliers.) There are perfectly analogous relations for multiplying by $[\![0,1,0]\!]$ or $[\![0,0,1]\!]$ from the left, too. Defining the double square brackets to be zero when either of the components is negative, this set of relations holds universally and the sufficient set of anchor points is the Wigner representation of $[\![1,0,0]\!]$, $[\![0,1,0]\!]$, $[\![0,0,1]\!]$, where we prove the equivalence to the Fock picture formulas directly.
Now one would just prove either of the relations (arguing by symmetry) to hold for the right-hand side, using re-ordering of the products after a multiplication by $$[\![1,0,0]\!] \cong a^\dagger b + ab^\dagger$$ and $$n \cong a^\dagger a + b^\dagger b$$ from the left. It requires a certain amount of work but should be doable. (I tried but ran into some kind of a numerical error.)
With a future edit on my mind, I will try to finish the proof of the induction step to see if I am right about the additional factor.
Assuming the relation is true, I doubt that there is any more "closed form" than the expansion $$\sum_{j=0}^{n_x} \sum_{k=0}^{n_y} \sum_{l=0}^{n_z} \frac{(-1)^{k+n_z-l} i^{n_y}}{j!k!l!(n_x-j)!(n_y-k)!(n_z-l)!} (a^\dagger)^{j+k+l} a^{n_x+n_y-j-k+l} (b^\dagger)^{n_x+n_y+n_z-j-k-l} b^{j+k+n_z-l}$$ as this does not seem to have any factorisation except for the one employing the normal reordering brackets.