The Kullback-Leibler divergence $D_{\rm KL}(Q||P)$ of two distributions $Q,P$ has been generalized to multiple distributions in various ways:
[1] information radius: $R(P_1,\ldots,P_k)=\frac{1}{k}\sum_{i=1}^k D_{\rm KL}\big(P_i\,\big\|\,\frac{1}{k}\sum_{j=1}^k P_j\big)$
[2] average divergence: $K(P_1,\ldots,P_k)=\frac{1}{k(k-1)}\sum_{i\ne j} D_{\rm KL}(P_i\|P_j)$
[3,4] dissimilarity: the weighted arithmetic mean of the KL distances between each of the $P_i$’s and the barycenter of all the $P_i$’s
References
[1] Robin Sibson, Information radius. Probability Theory and Related Fields, 14, 149–160 (1969).
[2] Andrea Sgarro, Informational divergence and the dissimilarity of probability distributions. Calcolo, 18, 293–302 (1981).
[3] Michèle Basseville, Divergence measures for statistical data processing (2010).
[4] Darío García-García and Robert C. Williamson, Divergences and Risks for Multiclass Experiments (2012).
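For concreteness, here is a minimal numerical sketch of the three generalizations above, for distributions given as arrays on a common finite support. This is my own illustration, not taken from the references; the function names are just labels, and I assume all the arrays are strictly positive so every KL term is finite.

```python
import numpy as np

def kl(p, q):
    """D_KL(p || q) for discrete distributions (assumes q > 0 wherever p > 0)."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def dissimilarity(dists, weights=None):
    """Weighted mean of the KL distances to the weighted barycenter [3, 4];
    with uniform weights this reduces to Sibson's information radius [1]."""
    k = len(dists)
    w = np.full(k, 1.0 / k) if weights is None else np.asarray(weights)
    bary = sum(wi * pi for wi, pi in zip(w, dists))  # barycenter of the P_i's
    return float(sum(wi * kl(pi, bary) for wi, pi in zip(w, dists)))

def average_divergence(dists):
    """Average pairwise divergence K(P_1, ..., P_k) of [2]."""
    k = len(dists)
    return sum(kl(p, q) for i, p in enumerate(dists)
               for j, q in enumerate(dists) if i != j) / (k * (k - 1))

P = [np.array([0.5, 0.5]), np.array([0.9, 0.1]), np.array([0.2, 0.8])]
print(dissimilarity(P), average_divergence(P))
```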
Assuming you want to minimize the Kullback–Leibler divergence
$$D(P\parallel Q)=\int dP\,\ln\frac{dP}{dQ}$$
over all isotropic Gaussians $P$, "the" minimizer is in general not unique; accordingly, the map taking $Q$ to a minimizer is not Lipschitz even on the set of measures $Q$ where the minimizer is unique.
The idea of a counterexample is quite simple: Suppose that $d=1$. Let $Q_h$ be the probability measure with pdf $q_h$ given by the formula
$$q_h(x)=c_h\big((1+h)\,f_{1,a}(x)\,1(x>0)+(1-h)\,f_{-1,a}(x)\,1(x<0)\big)$$
for real $x$, where $f_{t,a}$ is the pdf of the normal distribution $N(t,a^2)$, $a>0$ is small enough (the condition $0<a<\sqrt{2/\pi}$ should do), $h$ is a real number very close to $0$, and $c_h(\approx1/2)$ is the normalizing factor.
Since $a$ is rather small, $Q_h$ is somewhat close to the mixture of the rather narrow normal distributions $N(1,a^2)$ and $N(-1,a^2)$ with slightly unequal weights, $c_h\,(1+h)$ and $c_h\,(1-h)$ respectively. So, a minimizer $P_h$ of the Kullback–Leibler divergence $D(P\parallel Q_h)$ in $P$ should be sufficiently close to $N(1,a^2)$ or $N(-1,a^2)$ depending on whether the small perturbation $h$ is $>0$ or $<0$, respectively. Thus, an infinitesimally small change from, say, $h>0$ to $-h<0$ will result in quite a nonnegligible change from $P_h\approx N(1,a^2)$ to $P_{-h}\approx N(-1,a^2)$. (If $h=0$, then there will be two minimizers.)
I can write down details later, if you want them.
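In the meantime, here is a minimal numerical sketch of the counterexample (my own illustration, not part of the original argument): it fits a one-dimensional Gaussian $N(m,s^2)$ to $Q_h$ by numerically minimizing $D(N(m,s^2)\parallel Q_h)$, with the particular choice $a=0.3$, which lies inside the suggested range $0<a<\sqrt{2/\pi}$.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize
from scipy.stats import norm

a = 0.3                        # narrow component width; chosen with 0 < a < sqrt(2/pi)
c = 1 / (2 * norm.cdf(1 / a))  # normalizing factor c_h (here it does not depend on h)

def log_q(x, h):
    """log q_h(x): the N(1, a^2) branch for x > 0, the N(-1, a^2) branch for x < 0."""
    t = 1.0 if x > 0 else -1.0
    w = 1 + h if x > 0 else 1 - h
    return np.log(c * w) + norm.logpdf(x, t, a)

def kl(params, h):
    """D(N(m, s^2) || Q_h) by numerical integration."""
    m, log_s = params
    s = np.exp(log_s)  # parametrize s > 0 via its logarithm
    f = lambda x: norm.pdf(x, m, s) * (norm.logpdf(x, m, s) - log_q(x, h))
    return quad(f, -8, 8, points=[0.0], limit=200)[0]

for h in (0.05, -0.05):
    # try starts near each narrow bump and keep the better local minimum
    best = min((minimize(kl, x0, args=(h,), method='Nelder-Mead')
                for x0 in ([1.0, np.log(a)], [-1.0, np.log(a)])),
               key=lambda res: res.fun)
    m, s = best.x[0], np.exp(best.x[1])
    print(f"h = {h:+.2f}: minimizer ~ N({m:.3f}, {s:.3f}^2)")
```

Flipping the sign of $h$ flips the fitted mean from $\approx1$ to $\approx-1$, while the fitted $s$ stays close to $a$, as described above.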
Responding to the comment of the OP about a possible relation of your question to a result by Csiszár: Your question concerns the existence and uniqueness, for any given probability measure (PM) $Q$, of a PM $P_{\mathcal S,Q}\in\mathcal S$ such that $D(P_{\mathcal S,Q}\parallel Q)\le D(P\parallel Q)$ for all $P\in\mathcal S:=\mathcal P$, where $\mathcal P$ is the set of all isotropic Gaussian PMs.
In contrast, Csiszár's result is that for any given PM $P$ (rather than $Q$) there is a unique PM $\tilde P^{\mathcal S,P}$ such that
$D(\tilde P^{\mathcal S,P}\parallel Q)+D(P\parallel\mathcal S)\le D(P\parallel Q)$ for all $Q\in\mathcal S$ (rather than $P\in\mathcal S$), where $D(P\parallel\mathcal S):=\inf_{Q\in\mathcal S}D(P\parallel Q)$. (So, this looks like some kind of Pythagorean inequality.)
A corollary to this result by Csiszár is that $D(\tilde P^{\mathcal S,P}\parallel Q)\le D(P\parallel Q)$ for all $Q\in\mathcal S$ (rather than $P\in\mathcal S$).
So, if $D$ were a metric, your question would be about the existence and uniqueness of a PM in $\mathcal S$ closest to $Q$. On the other hand, again if $D$ were a metric, the mentioned corollary of Csiszár's result would say that the segment $\tilde P^{\mathcal S,P}Q$, which is the projection of the segment $PQ$ onto $\mathcal S$, is no longer than $PQ$, for any $Q$. The latter property would be equivalent to your "closest" property if $D$ were a Euclidean metric. But $D$ is not a metric at all. So, Csiszár's result says something different from what your question is about.
(The comparison of your question to Csiszár's result got more complicated than necessary because you interchanged the usual order of the arguments $P$ and $Q$ of $D(P\parallel Q)$, which was also used by Csiszár. So, I have edited your post, and mine, accordingly.)
Best Answer
$\newcommand\de\delta\newcommand{\KL}{\operatorname{KL}}\newcommand{\p}{\,\|\,}$The maps $$\mu\mapsto\sqrt{\KL(\mu\p\nu)}$$ and $$\nu\mapsto\sqrt{\KL(\mu\p\nu)}$$ are not convex in general.
Indeed, let $\mu_p:=p\de_0+(1-p)\de_1$, where $p\in(0,1)$ and $\de_a$ is the Dirac measure supported on $\{a\}$.
Then the second partial derivative with respect to $p$ of $\sqrt{\KL(\mu_p\p\mu_r)}$ at $(p,r)=(1/10,1/11)$ is $-7.17\ldots<0$. So, $\sqrt{\KL(\mu\p\mu_r)}$ is not convex in $\mu$.
Also, the second partial derivative with respect to $r$ of $\sqrt{\KL(\mu_p\p\mu_r)}$ at $(p,r)=(1/10,1/9)$ is $-11.50\ldots<0$. So, $\sqrt{\KL(\mu_p\p\nu)}$ is not convex in $\nu$.
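These two values are straightforward to reproduce, e.g. with SymPy; the following is a verification sketch of mine, not part of the original answer.

```python
import sympy as sp

p, r = sp.symbols('p r', positive=True)
# KL(mu_p || mu_r) for mu_p = p*delta_0 + (1-p)*delta_1 on {0, 1}
KL = p * sp.log(p / r) + (1 - p) * sp.log((1 - p) / (1 - r))
F = sp.sqrt(KL)

d2_mu = sp.diff(F, p, 2).subs({p: sp.Rational(1, 10), r: sp.Rational(1, 11)})
d2_nu = sp.diff(F, r, 2).subs({p: sp.Rational(1, 10), r: sp.Rational(1, 9)})
print(sp.N(d2_mu), sp.N(d2_nu))  # approx -7.17 and -11.50, both negative
```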
You also asked: "If this is not true in general, does there exist a measure $\mu\in P(X)$ such that $\nu\mapsto \sqrt{\KL(\mu\p\nu)}$ is a convex map $P(X)\to[0,+\infty]$? Or a measure $\nu$ such that $\mu\mapsto \sqrt{\KL(\mu\p\nu)}$ is convex?"
The answer to each of these two questions is yes, at least when $X=\{0,1\}$, say.
For $p\in(0,1)$, let
\begin{equation} F(p):=\sqrt{\KL(\mu_p\p\mu_{1/2})}, \end{equation} \begin{equation} f(p):=4F''(p)\,\KL(\mu_p\p\mu_{1/2})^{3/2}, \end{equation} \begin{equation} f_1(p):=f'(p)\,(1-p)^2 p^2. \end{equation} Then $f(1/2)=0$ (indeed, $f=2KK''-(K')^2$, where $K(p):=\KL(\mu_p\p\mu_{1/2})$ vanishes together with $K'$ at $p=1/2$), $f_1(1/2)=f'_1(1/2)=f''_1(1/2)=0$, and \begin{equation} f'''_1(p)=\frac{2+4 p(1-p)}{(1-p)^2 p^2}>0. \end{equation} So, successively, $f''_1$ changes sign from $-$ to $+$ at $1/2$, $f'_1\ge0$, and $f_1$ (and hence $f'$) changes sign from $-$ to $+$ at $1/2$; together with $f(1/2)=0$ this gives $f\ge0$. It follows that $F''(p)\ge0$, so that $\sqrt{\KL(\mu\p\mu_{1/2})}$ is convex in $\mu$.
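The displayed identity for $f'''_1$ can be checked symbolically. Here is a short SymPy sketch (mine, for verification only), which uses the identity $4F''\,K^{3/2}=2KK''-(K')^2$ noted above:

```python
import sympy as sp

p = sp.symbols('p', positive=True)
K = p * sp.log(2 * p) + (1 - p) * sp.log(2 * (1 - p))  # KL(mu_p || mu_{1/2})
f = 2 * K * sp.diff(K, p, 2) - sp.diff(K, p) ** 2       # f = 4 F'' K^(3/2)
f1 = sp.diff(f, p) * (1 - p) ** 2 * p ** 2

# f1 and its first two derivatives vanish at p = 1/2 ...
print([sp.simplify(sp.diff(f1, p, n).subs(p, sp.Rational(1, 2))) for n in range(3)])
# ... and f1''' equals the displayed rational function
rhs = (2 + 4 * p * (1 - p)) / ((1 - p) ** 2 * p ** 2)
print(sp.simplify(sp.diff(f1, p, 3) - rhs))  # expected: 0
```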
For $r\in(0,1)$, let
\begin{equation} G(r):=\sqrt{\KL(\mu_{1/2}\p\mu_r)}, \end{equation} and \begin{equation} g(r):=4G''(r)\,\KL(\mu_{1/2}\p\mu_r)^{3/2}. \end{equation} Then $g(1/2)=g'(1/2)=g''(1/2)=g'''(1/2)=0$ and \begin{equation} g''''(1/2+h)\,\frac{(1 - 4 h^2)^4}{16}=9+156 h^2-16 h^4+64 h^6 \\ >9-1+156 h^2+64 h^6>0 \end{equation} if $|h|<1/2$, since then $16h^4<1$. So, successively, $g'''$ changes sign from $-$ to $+$ at $1/2$, $g''\ge0$, $g'$ changes sign from $-$ to $+$ at $1/2$, and $g\ge0$. It follows that $G''(r)\ge0$, so that $\sqrt{\KL(\mu_{1/2}\p\nu)}$ is convex in $\nu$.
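Similarly, one can confirm symbolically that $g$ and its first three derivatives vanish at $r=1/2$, and spot-check the positivity of $g''''$ on a grid. Again, this is my own verification sketch, using $4G''\,K^{3/2}=2KK''-(K')^2$ with $K(r):=\KL(\mu_{1/2}\p\mu_r)$:

```python
import numpy as np
import sympy as sp

r = sp.symbols('r', positive=True)
# KL(mu_{1/2} || mu_r) on the two-point space {0, 1}
K = sp.Rational(1, 2) * sp.log(1 / (2 * r)) + sp.Rational(1, 2) * sp.log(1 / (2 * (1 - r)))
g = 2 * K * sp.diff(K, r, 2) - sp.diff(K, r) ** 2  # g = 4 G'' K^(3/2)

# g, g', g'', g''' all vanish at r = 1/2
print([sp.simplify(sp.diff(g, r, n).subs(r, sp.Rational(1, 2))) for n in range(4)])

# positivity of g'''' on a grid in (0, 1), supporting g >= 0 and hence G'' >= 0
g4 = sp.lambdify(r, sp.diff(g, r, 4), 'numpy')
print(np.all(g4(np.linspace(0.01, 0.99, 99)) > 0))  # expected: True
```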
Remark 1: The existence of a probability measure $\nu\in P(X)$ such that $\sqrt{\KL(\mu\p\nu)}$ is convex in $\mu$ holds for any Polish space $X$. Indeed, the case when $X$ is a singleton set is trivial. Otherwise, take any distinct points $x$ and $y$ in $X$ and let $\nu:=\frac12\,\de_x+\frac12\,\de_y$. Then the condition $\mu\ll\nu$ implies $\mu=p\,\de_x+(1-p)\,\de_y$ for some $p\in[0,1]$, so that here $\sqrt{\KL(\mu\p\nu)}=F(p)$ (and $\sqrt{\KL(\mu\p\nu)}=\infty$ if $\mu\not\ll\nu$, which does not affect the convexity).
Remark 2: Let us say that a probability measure $\mu\in P(X)$ is good if $\sqrt{\KL(\mu\p\nu)}$ is convex in $\nu$. Then it is easy to see that the Dirac measure $\de_x$ is good for any $x\in X$. It also follows from the above that $\mu:=\frac12\,\de_x+\frac12\,\de_y$ is good if $X=\{x,y\}$ for some $x,y$.
Finally, using reasoning similar to that above, one can show that, in the case when the cardinality of $X$ is $\ge3$, the only good probability measures $\mu\in P(X)$ are the Dirac measures. Since this answer has already grown rather long, I will leave the latter assertion as an exercise for interested readers.
Now the answer is quite complete.