For ease of typing, define
$$\eqalign{
M &= \Sigma^{-1} \;\implies\; dM = -M\,d\Sigma\,M
}$$
Your gradient is correct, so let's start with that and find its differential.
$$\eqalign{
G &= -\tfrac{1}{2} (nM-MSM) \\
dG
&= -\tfrac{1}{2} (n\,dM-dM\,SM-MS\,dM) \\
&= +\tfrac{1}{2} (n\,M\,d\Sigma\,M-M\,d\Sigma\,M\,SM-MSM\,d\Sigma\,M) \\
&= +\tfrac{1}{2} (n\,M\,d\Sigma\,M-M\,d\Sigma\,(2G+nM)-(2G+nM)\,d\Sigma\,M) \\
&= -\tfrac{1}{2} (n\,M\,d\Sigma\,M+2M\,d\Sigma\,G+2G\,d\Sigma\,M) \\
}$$
At this point, we'd normally use vec/vech operations, but you don't want to do that.
So let's introduce the double-dot product between tensors
$$\eqalign{
A={\cal B}:C \;\implies\; A_{ij}= \sum_{k,l} {\cal B}_{ijkl}C_{kl} \\
}$$
Let's also introduce the 4th order isotropic tensor ${\cal E}$ with components
${\cal E}_{ijkl} = \delta_{ik}\delta_{jl}$
This tensor is the identity for the double-dot product, i.e.
$\;A:{\cal E}= A = {\cal E}:A$
Another useful property is untangling matrix products
$\implies A\,dX\,B = A{\cal E}B^T:dX$
Continuing from before
$$\eqalign{
dG &= -\tfrac{1}{2} \big(n\,M{\cal E}M+2M{\cal E}G+2G{\cal E}M\big):d\Sigma \\
{\cal H} = \frac{\partial G}{\partial \Sigma}
&= -\tfrac{1}{2} \big(n\,M{\cal E}M+2M{\cal E}G+2G{\cal E}M\big) \\
{\cal H}_{ijkl} = \frac{\partial G_{ij}}{\partial \Sigma_{kl}}
&= -\tfrac{1}{2} \big(n\,M_{ip}{\cal E}_{pjkq}M_{ql}
+ 2M_{ip}{\cal E}_{pjkq}G_{ql}
+ 2G_{ip}{\cal E}_{pjkq}M_{ql}\big) \\
&= -\tfrac{n}{2}M_{ik}M_{jl} -M_{ik}G_{jl} -G_{ik}M_{jl} \\
}$$
I think it looks better with the $G$'s but you can eliminate them in favor of $S,M,\pm$ signs, and more indices.
$$\eqalign{
{\cal H}_{ijkl}
&= \tfrac{1}{2} \big(
n\,M_{ik}M_{jl}
- M_{ik}M_{jp}S_{pq}M_{ql}
- M_{ip}S_{pq}M_{qk}M_{jl}\big)
\\
&= \tfrac{1}{2} \big(
n\,M_{ik}M_{jl}
- M_{ik}(MSM)_{jl}
- (MSM)_{ik}M_{jl}\big)
\\
}$$
Best Answer
Your equation is equivalent to solving the following equation for $P$
$$ A\,P + P\,A^\top = B. \tag{1} $$
Such equation also solves for the infinite time (controllability) Gramian, which can also be calculated with
$$ P = -\int_0^\infty e^{A\,t}\,B\,e^{A^\top t}\,dt. \tag{2} $$
This integral should also be of size $n \times n$. As time goes to infinity $e^{A\,t}$ is not well defined when $A$ is positive semi-definite. It can also be noted that in order to be able to solve for $P$ the spectra of $A$ and $-A$ need to be disjoint. In order words $A$ can't have an eigenvalue of zero, which would imply that $A$ should be positive definite instead of positive semi-definite. When multiplying both sides of $(1)$ by minus ones keeps it equivalent to the original problem. This is also equivalent to multiplying both $A$ and $B$ by minus one, which transforms $(2)$ into
$$ P = \int_0^\infty e^{-A\,t}\,B\,e^{-A^\top t}\,dt. \tag{3} $$
Now the term $e^{-A\,t}$ does stay bounded and converges to zero as time goes to infinity. The rate of how fast this integral converges should be roughly inversely proportional to the smallest eigenvalue of $A$, however any finite time numerical approximation of this integral can be used as a lower bound for $P$.