Yes this is true, formally this follows by the envelope theorem. In an abstract and very smooth setting, the envelope theorem says that for an objective functional depending on a parameter $t$
$$
F(t)=\max\limits_z f(t,z),
$$
then the derivative of the optimal value can be computed as
$$
\frac{dF}{dt}(t)=\partial_t f(t,z_t)
\qquad
\mbox{for any smooth selection of a maximizer }z_t \mbox{ of }F(t).
$$
This can be seen easily: for any such choice of a maximizer, just apply a chain rule and use the optimality condition of $z_t$ in the maximization problem for fixed $t$:
$$
F'(t)=\frac d{dt}f(t,z_t)=\partial_t f(t,z_t)+\underbrace{\partial_zf(t,z_t)}_{=0}\frac {dz_t}{dt}.
$$
This means, roughly speaking, that one can simply forget that the minimizer varies, only the variation of the functional matter.
In your specific context, you are trying to differentiate (w.r.t $\rho$) the optimal value of the optimization problem given by the Kantorovich dual formulation
$$
W^2(\rho,\eta)
=F_\eta(\rho)
=\max\limits_\phi f_\eta(\rho,\phi)
=\max\limits_\phi \left\{\int \rho\phi+\int\eta\phi^c\right\}
$$
(here $\eta$ is fixed once and for all, I'm mimicking my $F,f$ notations above to give some perspective and I hope the notation is sufficiently self-explanatory).
Although the Kantorovich potential $\phi$ from $\rho$ to $\eta$ (the optimizer) varies when $\rho$ varies, the envelope theorem strongly suggests that you can actually argue as it did not vary at leading order (same for its $c$-transform $\phi^c$), and one can simply differentiate the functional w.r.t. the varying "parameter" $\rho$. Since the Kantorovich functional is linear in $\rho$, the conclusion is indeed that the first variation is given by $ \frac{\partial f_\eta}{\partial_\rho}(\rho,\phi)=\phi$.
Of course various subtle problems may arise owing essentially to the infinite-dimensional setting and functional-analytic details, but this is the rough idea.
For a completely rigorous statement and proof I can recommend Filippo Santambrogio's book [1], in particular chapter 7 and Proposition 7.17
[1] Santambrogio, Filippo. "Optimal transport for applied mathematicians." Birkäuser, NY 55.58-63 (2015): 94.
$\newcommand{\ep}{\varepsilon}\newcommand\R{\mathbb R}$Yes, the minimizer $\mu$ is absolutely continuous (w.r. to the Lebesgue measure $|\cdot|$).
Indeed, you showed that
\begin{equation*}
F(\rho_n)\to m:=\inf_\rho F(\rho) \tag{-1}
\end{equation*}
and
\begin{equation*}
\mu_{\rho_n}\to\mu \tag{-0.5}
\end{equation*}
weakly for some sequence $(\rho_n)$ of probability densities and some probability measure $\mu$, where
\begin{equation*}
F(\rho):=\int\rho\ln\rho\,dx+W_2^2(\rho,\eta) \tag{0}
\end{equation*}
and $\mu_\rho(dx):=\rho(x)\,dx$.
Take any set $E\subseteq\R$ with $|E|=0$. We have to show that then $\mu(E)=0$.
Take any real $\ep>0$. By the regularity of the Lebesgue measure, there is an open set $G_\ep\subset\R$ such that
\begin{equation*}
\text{$E\subseteq G_\ep$ and $|G_\ep|<\ep$.} \tag{0.5}
\end{equation*}
By (-0.5) and the Portmanteau theorem,
\begin{equation*}
\mu(G_\ep)\le\liminf_n\mu_{\rho_n}(G_\ep). \tag{1}
\end{equation*}
Next, for each real $a>1$,
\begin{equation*}
\mu_{\rho_n}(G_\ep)=K_n+L_n, \tag{2}
\end{equation*}
where
\begin{equation*}
K_n:=\int_{G_\ep\cap[\rho_n\le a]}\rho_n\,dx,\quad L_n:=\int_{G_\ep\cap[\rho_n>a]}\rho_n\,dx,
\end{equation*}
$[\rho_n\le a]:=\rho_n^{-1}((-\infty,a])$, $[\rho_n>a]:=\rho_n^{-1}((a,\infty))$.
Further,
\begin{equation*}
K_n\le a|G_\ep|<a\ep \tag{3}
\end{equation*}
by (0.5), and
\begin{equation*}
L_n\le \int_{[\rho_n>a]}\rho_n\,dx
\le\frac1{\ln a}\int \rho_n\ln\rho_n\,dx\le \frac{m+1}{\ln a} \tag{4}
\end{equation*}
for all large enough $n$, by (-1) and (0).
By (0.5), (1), (2), (3), (4),
\begin{equation*}
\mu(E)\le\mu(G_\ep)\le a\ep+\frac{m+1}{\ln a},
\end{equation*}
for all real $\ep>0$ and all real $a>1$. Letting now $\ep\downarrow0$ and then $a\to\infty$, we get $\mu(E)=0$, as desired.
Best Answer
$\newcommand{\R}{\mathbb R}\newcommand{\H}{\mathcal H}$Building up on previous comments, and slighlty elaborating. The answer to your question, as is, is NO. Recall that convergence in the Wasserstein distance $W_2$ is equivalent to weak-* convergence (duality with continuous bounded functions) and convergence of the second moments (or equivalently, with all continuous functions growing at most quadratically at infinity). As a consequence, in order for your compactness to hold you would need to control the second moments in sublevelsets. However, the entropy is translation-invariant, so mass can escape at infinity in a sublevelset $\{\H\leq C\}$. More precisely, for a counterexample take as in @Kostya_I 's comment any measure $\rho$ with finite entropy $\H(\rho)=C<\infty$, take any $x_n\in\R^d$ such that $|x_n|\to\infty$, and consider the translated sequence $\rho_n=\rho(\cdot-x_n)$. Then obviously $\H(\rho_n)=\H(\rho)=C$ for any $n$, but $\rho_n$ fails to converge in the Wasserstein distance (since for example $\rho_n$ converges weakly to zero when tested with respect to any compactly supported function).
Since the problem here is only due mass escaping at infinity, for your desired compactness to hold you only need a little bit of confinement. For example it would suffice to control the second moment $\mathfrak m_2(\rho)=\int|x|^2\rho(dx)$, for example by adding a potential energy $$ \mathcal F(\rho)=\H(\rho)+\int V(x)\rho(dx) $$ for some potential $V(x)\gtrsim C|x|^2$ growing at least quadratically at infinity. For this you can use the handy Carleman estimate $$ \H(\rho)\geq -C_\alpha(1+\mathfrak{m}_2(\rho))^\alpha, \qquad \alpha\in \big(d/(d+2),1\big). $$ Check the original JKO paper [1], proof of proposition 4.1 (eqs. 14 and 15 therein). Actually you can replace the potential energy $\int V\rho$ by anything controlling $\mathfrak m_2(\rho)^{\frac{d}{d+2}}$ in order to retrive control over the moments from energy bounds $\mathcal F(\rho)\leq C\Rightarrow \mathfrak m_2(\rho)\leq M(C)$.
Connecting with @pseudocydonia 's answer above: another way to interpret the "potential energy above" would be to consider instead the relative entropy $\H_\mu(\rho)=\int \frac{\rho}{\mu}\log\left(\frac{\rho}{\mu}\right)d\mu$ for a log-quadratic reference measure $\mu\sim e^{-V}$ with $V\gtrsim C|x|^2$.
Side note: in your setting you also have a slightly worse problem than what you may suspect. When computed relatively to the Lebesgue measure on (sufficiently wide) unbounded sets the entropy is actually not bounded from below on $\mathcal P_{ac}^2$.
[1] Jordan, R., Kinderlehrer, D., & Otto, F. (1998). The variational formulation of the Fokker--Planck equation. SIAM journal on mathematical analysis, 29(1), 1-17.