Can I simplify this to pick out of the summation only the mode that matches the transition energy $\hbar \omega_{21}$? This would be because photons are emitted/absorbed entirely anyway, so the energies have to match, and the photons which do not have the right energy simply won't interact with the atom
That's exactly right.
Let's see it in detail.
Slightly nicer notation
$\newcommand{\ket}[1]{\left \lvert #1 \right\rangle}$
First let's make the notation slightly more standard.
We have $\hat{\sigma} \equiv |2\rangle\langle1|$ so that in the $|2\rangle, |2\rangle$ basis, $\hat{\sigma}^\dagger \hat{\sigma}$ has matrix representation
$$\left( \begin{array}{cc}
0 & 0 \\ 0 & 1
\end{array} \right) \, .$$
We're always free to put in an energy offset so we can shift to
$$\left( \begin{array}{cc}
-1/2 & 0 \\ 0 & 1/2
\end{array} \right) \, .$$
which is conveniently written as $- \sigma_z / 2$.
Therefore, our system's Hamiltonian is$^{[a]}$
\begin{align}
H / \hbar
=& H_A/\hbar + H_R/\hbar + H_I/\hbar \\
=& -\frac{\omega_{21}}{2} \sigma_z +
\sum_k \omega_k (a_k^\dagger a_k + 1/2) +
\sum_k g_k (a^\dagger_k + a_k) \sigma_x \, .
\end{align}
Interaction picture
This Hamiltonian is kind of complicated because there are so many interaction terms.
We can enormously simplify it by using the so-called "interaction picture".
The idea is to "unwind" the evolution of the atom and fields caused by their own Hamiltonians.
In the interaction picture formalism, you get a new effective Hamiltonian and the operators pick up a time dependence.
You construct the propagators of the atom and fields without the coupling:
$$U_A(t) \equiv \exp \left( -i H_A t / \hbar \right) = \exp \left(i \omega_{21} \sigma_z t / 2 \right)$$
and
$$U_R = \prod_k \exp \left(-i \omega_k t (a_k + a_k^\dagger + 1/2) \right) \, .$$
Then you find the time dependent operators.
For each operator $T$, it's time dependent version is
$$V(t) = U^\dagger(t) T U(t)$$
where here $U(t)$ means $U_A(t) \otimes U_R(t)$.
The interaction picture Hamiltonian is only the interaction part $H_I$, but with the time dependent versions of the operators.$^{[b]}$
So, all we have to do is figure out what $\sigma_x$ and $a$ look like with this time dependence.
You can do the algebra and you'll find that for the atom
$$
U_A(t)^\dagger \sigma_x U_A(t) = \cos(\omega_{21}t)\sigma_x + \sin(\omega_{21}t) \sigma_y = e^{i \omega_{21} t}\sigma_+ + e^{-i \omega_{21} t}\sigma_-
$$
where $\sigma_+ \equiv \left \lvert 2 \right\rangle \left \langle 1 \right \rvert$ and $\sigma_- = \sigma_+^\dagger$.
You can also find that in the interaction picture $a_k \rightarrow a e^{-i \omega_k t}a$.
Putting it all together, in the interaction picture our Hamiltonian is
\begin{align}
H'/\hbar
=&
\sum_k g_k
(a_k^\dagger e^{i \omega_k t} + a_k e^{-i \omega_k t})
(\sigma_+ e^{i \omega_{21} t} + \sigma_- e^{-i \omega_{21}t}) \\
=& \sum_k g_k \left(
a_k^\dagger \sigma_+ e^{i (\omega_k + \omega_{21})t} +
a_k^\dagger \sigma_- e^{i (\omega_k - \omega_{21})t} +
a_k \sigma_+ e^{i (-\omega_k + \omega_{21})t} +
a_k \sigma_- e^{i (-\omega_k - \omega_{21})t}
\right) \, .
\end{align}
Rotating wave approximation
Of these four terms, two of them have fast oscillating time dependencies because the frequencies in the exponent are either both negative or both positive.
In a lot of cases these fast oscillating time terms contribute much less to the dynamics than the slower terms.
This is roughly because the integral of an oscillating function stays small, while the integral of a constant function gets large.
Anyway, you can often drop those terms, leaving
\begin{align}
H'/\hbar
=& \sum_k g_k \left(
a_k^\dagger \sigma_- e^{i (\omega_k - \omega_{21})t} +
a_k \sigma_+ e^{-i (\omega_k - \omega_{21})t}
\right) \, .
\end{align}
Doing this is called the "rotating wave approximation.
Now we've got a sum over $k$ where each term has a frequency $\omega_k - \omega_{21}$.
Again, if the modes are spaced out, then you can have a situation where most of these frequencies are large, but there could be one radiation mode $l$ resonant with the atom, i.e. $\omega_l \approx \omega_{21}$.
Then, that term dominates and we get
$$H'/\hbar \approx g_l \left(
a_l^\dagger \sigma_- e^{i (\omega_l - \omega_{21})t} +
a_l \sigma_+ e^{-i (\omega_l - \omega_{21})t}
\right) \, .$$
If the frequencies are exactly the same (i.e. the atom and radiation mode $l$ are exactly on resonance), then you get
$$H'/\hbar \approx g_l \left(
a_l^\dagger \sigma_- + a_l \sigma_+ \right) \, .$$
In the subspace of radiation Fock states $\ket{n}$ and $\ket{n+1}$, this Hamiltonian is just proportional to $\sigma_x$, i.e. the quantum of energy oscillates between the atom and the radiation field.
Review
Your intuition about the atom only interacting with radiation modes of the same frequency is correct.
The "rotating wave approximation" is the name given to the approximation where you drop all the radiation modes that aren't resonant with the atom (of course, this works for things other than atoms and radiation).
The terms we dropped are ones that do not conserve energy.
It turns out they actually do have a role in real systems and some times they're very important, but in a lot of cases you can ignore them and use the super simple Hamiltonian we found.
$[a]$: Note that by working with $H/\hbar$, which has dimensions of frequency, all the $\hbar$'s on the right hand side are gone.
This is really convenient and it's what people are really doing when they say that they're "setting $\hbar$ to 1".
$[b]$: See this other Physics SE post for a demonstration of why.
Best Answer
Thanks to a comment from KF Gauss, and by thinking of the comparison with the classical picture, I managed to work out what is happening.
$q(t)$ and $p(t)$ evolve according to a classical harmonic oscillator Hamiltonian, so will evolve sinusoidally, out of phase with each other by $\pi/2$. Assuming q(t) is initially at its maximum value, $$q(t)=A\cos(\omega_{n}t),\;\;p(t)=-\omega_{n}A\cos(\omega_{n}t+\pi/2)$$ If we let $$\textbf{E}(t)=E(t)\hat{\textbf{e}}_{x},\textbf{B}(t)=B(t)\hat{\textbf{e}}_{y}$$ Then we see that $$E(t)=-\sqrt{\frac{2}{\epsilon_{0}V}}p(t)\sin(k_{n}z)$$ and $$B(t)=\sqrt{\frac{2}{\epsilon_{0}V}}k_{n}q(t)\cos(k_{n}z)$$ We see that q(t) and p(t) represent time varying amplitudes that cause the amplitude of the electric and magnetic fields to vary with time, thus causing the electric field to form standing wave solutions, as demonstrated for $n=5$ here, where $z$ is in units of $L$ the centre of the dark red bands are where $\sin(k_{n}z)=1$ hence $E(t)=-\frac{2}{\epsilon_{0}V}q(t)$ and the centre of the dark blue bands are where $\sin(k_{n}z)=-1$ hence $E(t)=\frac{2}{\epsilon_{0}V}q(t)$.
So after quantisation, $$\hat{E}=-\sqrt{\frac{2}{\epsilon_{0}V}}\sin(k_{n}z)\hat{P}$$ and $$\hat{B}=\sqrt{\frac{2}{\epsilon_{0}V}}k_{n}\cos(k_{n}z)\hat{Q}$$ that is, up to some $z$-dependant constant, the quadrature operators are equivalent to the field operators. Finally, returning to this gif, we realise that this isn't a wavepacket of light bouncing back and forth between the mirrors, because $\hat{Q}$ has no relation to position. Taking the expectation value of the electric field of a coherent state, $$\langle\alpha e^{i\omega_{n} t}|\hat{E}|\alpha e^{i\omega _{n}t}\rangle=-\sqrt{\frac{2}{\epsilon_{0}V}}\sin(k_{n}z)\langle\alpha e^{i\omega_{n} t}|\hat{P}|\alpha e^{i\omega_{n} t}\rangle=\sqrt{\frac{4\omega_{n}\hbar}{\epsilon_{0}V}}|\alpha|\sin(k_{n}z)\cos(\omega_{n} t-\theta-\pi/2)$$ for $\alpha=re^{i\theta}$ we see that the amplitude of a coherent state oscillates in time, returning the classical standing wave image, where a real valued $\alpha$ corresponds to the assumption that $\langle E(t)\rangle$ is at the maximum value of its oscillation at $t=0$.
So in conclusion, my confusion was mistaking this oscillation in the gif as oscillation of a light wavepacket between the mirrors of the cavity, when it in fact (in the context i'm considering) represents the oscillation of the amplitude of the semi-classical electric field. Finally, the quadrature operators are equivalent to the field operators up to a constant.