Replacing covariants with partials:
The source equation you cite involves a skew-symmetric tensor density.
You may know that if $\nabla$ is the Levi-Civita connection, you can calculate divergences of vector fields without using the connection coefficients/Christoffel symbols: $$ \nabla_\mu X^\mu=\frac{1}{\sqrt{-g}}\partial_\mu(X^\mu\sqrt{-g}). $$
Now, let $\mathcal{J}^\mu=X^\mu\sqrt{-g}$ be a vector density. Then $$ (\nabla_\mu X^\mu)\sqrt{-g}=\nabla_\mu(X^\mu\sqrt{-g})=\nabla_\mu\mathcal{J}^\mu=\partial_\mu(X^\mu\sqrt{-g})=\partial_\mu\mathcal{J}^\mu, $$ so vector density fields can be differentiated partially.
However, as it turns out the situation is the same for arbitrary (k,0)-type antisymmetric tensor fields: Let $F^{\mu\nu}$ be such a field. Then $$ \nabla_\nu F^{\mu\nu}=\partial_\nu F^{\mu\nu}+\Gamma^\mu_{\ \nu\sigma}F^{\sigma\nu}+\Gamma^{\nu}_{\ \nu\sigma}F^{\mu\sigma}, $$ but the second term here vanishes because $\Gamma$ is symmetric in the lower indices, but $F$ is skew-symmetric in the upper indices, so we're left with $$ \nabla_\nu F^{\mu\nu}=\partial_\nu F^{\mu\nu}+\Gamma^\nu_{\ \nu\sigma}F^{\mu\sigma}=\partial_\nu F^{\mu\nu}+\partial_\sigma\ln\sqrt{-g}F^{\mu\sigma} \\ =\frac{1}{\sqrt{-g}}\partial_\nu(F^{\mu\nu}\sqrt{-g}). $$
Defining $\mathcal{F}^{\mu\nu}=F^{\mu\nu}\sqrt{-g}$ gives $$ \nabla_\nu \mathcal{F}^{\mu\nu}=\partial_\nu \mathcal{F}^{\mu\nu}. $$
Dependence of Maxwell's equations on the (pseudo-)Riemannian structure:
Remember that the fundamental field here is $A_\mu$, which is a covector field. The only Maxwell equation that doesn't depend on the Riemannian structure is $\partial_{[\mu} F_{\nu\sigma]}=0$, because you can replace the covariants here with partials do to skew-symmetry.
Also do remember that $F_{\mu\nu}=\partial_\mu A_\nu -\partial_\nu A_\mu$ is also well-defined without covariants.
When we get to the source equation things change, however, because
- to define divergence we need upper indices, however $F$ is lower-indiced by nature, so we need the metric to raise indices;
- you can replace covariants with partials there only if you multiply with $\sqrt{-g}$ to create densities. Which, of course, depends on the metric.
So you can weasel out of using the connection (which is great for computations), but you cannot weasel out of using the metric, therefore Maxwell's equations are absolutely not topological.
An aside on differential forms:
You should read about differential forms. I am trying to think of reference that should be quick-readable by physicist. Probably Flanders' book is a good one. Otherwise Anthony Zee's General Relativity and QFT books also contain differential forms but only in a heuristic manner. Sean Carroll's GR book also has an OK recount of them.
Basically, differential forms are totally antisymmetric covariant tensor fields. Instead of using index notation as in, say $\omega_{\mu_1,...,\mu_k}$ to denote them, usually 'abstract' notation is used as $\omega=\sum_{\mu_1<...<\mu_k}\omega_{\mu_1...\mu_k}\mathrm{d}x^{\mu_1}\wedge...\wedge\mathrm{d}x^{\mu_k}$, where the basis is written out explicitly. The wedge symbols are skew-symmetric tensor products.
Differential forms are good because they generalize vector calculus to higher dimensions, arbitrary manifolds and also to cases when you don't have a metric. Differential forms can be differentiated ($\omega\mapsto\mathrm{d}\omega$), where the "$\mathrm{d}$" operator, called the exterior derivative, turns a $k$-form into a $k+1$ form without the need for a metric or a connection, and generalizes grad, div and curl, all in one.
The integral theorems of Green, Gauss and Stokes are also generalized.
The point is, if you also have a metric, the theory of differential forms is enriched. You get an option to turn $k$-forms into $n-k$ forms ($n$ is the dimension of your manifold), and also to define a "dual" operation to the exterior derivative, called the codifferential. The codifferential essentially brings the concept of divergence to differential forms.
Written with differential forms, Maxwell's equations are given by $$ \mathrm{d}F=0 \\ \mathrm{d}^\dagger F=kJ, $$ where $\mathrm{d}^\dagger$ is the codifferential, and $k$ is some constant I care not about right now.
I am noting two things:
- The $F$ field strength 2-form is given by $F=\mathrm{d}A$, where $A$ is ofc the 4-potential. The exterior derivative satisfies $\mathrm{d}^2=0$ (think of $\text{div}\ \text{curl}=0$ and $\text{curl}\ \text{grad}=0$), so with potentials, the first equation is $\mathrm{d}F=\mathrm{d}^2A=0$, which is trivially true.
- The first equation contains only $\mathrm{d}$, which is well-defined without a metric. The second equation depends on the codifferential $\mathrm{d}^\dagger$, which does depend on the metric. There is your metric dependance!
Formula (3) defines the canonical energy-momentum tensor of the field, which is defined only in flat Minkowski spacetime and is the Noether current associated to translation symmetry.
By contrast, formula (2) defines the Einstein-Hilbert energy momentum tensor, which is also defined in a general spacetime background, but is not a Noether current, at least not in the sense the canonical tensor is.
It is well-known that the canonical EM tensor often requires "improvement", whereas it is generally accepted that the Einstein-Hilbert EM tensor is the "correct" EM tensor of a given field.
Formula (4) would be valid if the EHEM tensor (Einstein-Hilbert energy momentum tensor) and the CEM tensor (canonical energy-momentum tensor) turned out to be always equivalent to one another. This is not true in general however.
This paper by Forger and Römer examines the relationship between these two tensors in great detail.
I will give a simpler analysis here. Suppose that the Lagrangian is $\mathcal L(\phi,\partial\phi,g)$ (the factor of $\sqrt{|\mathfrak g|}$ is considered part of the Lagrangian) and the field theory has some concept of diffeomorphism symmetry and the Lagrangian is diffeomorphism invariant. The Lie derivative of the $\phi=(\phi^i)$ field is written generally as $$ \delta_X\phi^i=X^\mu\phi^i_{,\mu}+Q^{i,\mu}{}_\nu(\phi,\partial\phi)X^\nu_{,\mu} $$for any vector field $X^\mu$. (For fields which are not in the representation of the diffeomorphism group, see the paper by Forger and Römer).
Under a diffeomorphism, the Lagrangian is a scalar density, thus $$ \delta_X\mathcal L=\partial_\mu(X^\mu\mathcal L)=\frac{1}{2}\mathfrak{T}^{\mu\nu}\delta_X g_{\mu\nu}+E_i\delta_X\phi^i+\partial_\mu(P^\mu_i\delta_X\phi^i), $$
where $$ \mathfrak T^{\mu\nu}:=2\frac{\delta \mathcal L}{\delta g_{\mu\nu}},\quad E_i =\frac{\delta\mathcal L}{\delta\phi^i},\quad P^\mu_i=\frac{\partial\mathcal L}{\partial\phi^i_{,\mu}},$$and finally, $\delta_Xg_{\mu\nu}=2\nabla_{[\mu}X_{\nu]}$ is the metric Lie derivative.
If we expand the Lie derivatives and integrate by parts on the vector field $X$ and rearrange the terms to $0$, we get $$ 0=[E_i\phi^i_{,\nu}-\partial_\mu(E_i Q^{i,\mu}{}_\nu)-\nabla_\mu\mathfrak T^{\mu}{}_\nu]X^\nu+\partial_\mu[\mathfrak T^\mu{}_\nu X^\nu+P^\mu_i\delta_X\phi^i+E_iQ^{i,\mu}{}_\nu X^\nu]. $$
If we integrate this and take $X^\mu$ to be compactly supported, we get that the first bracket $[...]$ must vanish identically. This gives $$ \nabla_\mu\mathfrak T^\mu{}_\nu=E_i\phi^i_{,\nu}-\partial_\mu(E_iQ^{i,\mu}{}_\nu) $$as an off-shell relation. If the field $\phi$ is on-shell, then this reduces to $\nabla_\mu\mathfrak T^\mu{}_\nu=0$. Let the vector density whose total divergence appears in the second bracket $\partial_\mu[...]$ be denoted $S^\mu_X$.
Then we also get that $\partial_\mu S^\mu_X=0$ for all possible choices of $X$ and $\phi$ and $g$. If we expand the Lie derivative in $S^\mu_X$ as well, we get $$S^\mu_X=(\mathfrak T^\mu{}_\nu+P^\mu_i\phi^i_{,\nu}-\mathcal L\delta^\mu_\nu+E_iQ^{i,\mu}{}_\nu)X^\nu+P^\mu_iQ^{i,\kappa}{}_\nu X^\nu_{,\kappa}. $$ It is then fairly easy to show that $\partial_\mu S^\mu_X=0$ for all choices of $X^\mu$ implies that $$ S^\mu_X=\partial_\rho K^{\mu\rho}_X,\quad K^{\mu\rho}_X=P^{[\mu}_i Q^{i,\rho]}{}_\nu X^\nu, $$ so $K^{\mu\rho}_X$ is skew-symmetric in $\mu\rho$, moreover $K^{\mu\rho}_X$ is unique (the higher order analogue of this proof is much more difficult and the uniqueness is lost!).
Now evaluate this entire relation ($S^\mu_X=\partial_\rho K^{\mu\rho}_X$) in flat spacetime with $X^\mu=\tau^\mu$ a constant translation vector! Moreover, evaluate it on-shell for the $\phi$ field, i.e. $E_i=0$. Then we get $$ S^\mu=(T^\mu{}_\nu+P^\mu_i\phi^i_{,\nu}-\mathcal L\delta^\mu_\nu)\tau^\nu=\partial_\rho(P^{[\mu}_i Q^{i,\rho]}{}_\nu)\tau^\nu, $$ and using that $\tau$ is any constant vector field, we get $$ T^\mu{}_\nu=\mathcal L\delta^\mu_\nu-P^\mu_i\phi^i_{,\nu}+\partial_\rho(P^{[\mu}_i Q^{i,\rho]}{}_\nu). $$
Here $T^\mu{}_\nu$ is the (flat spacetime limit of the) Einstein-Hilbert energy momentum tensor, the $\Theta^\mu{}_\nu:=\mathcal L\delta^\mu_\nu-P^\mu_i\phi^i_{,\nu}$ is the canonical energy momentum tensor, and the divergence terms are correction terms.
This is the correct relation at least under the assumptions made on the initial form of the Lagrangian.
Best Answer
Let us consider the action $$ S = \int_\mathcal{M} \text{d}^n x \sqrt{|g|} \mathscr{L}(\phi^\mu(x),\nabla_\alpha \phi^\mu(x)) $$ where $\mathcal{M}$ is a region of spacetime with a fixed background metric $g$, and vary $\phi^\mu$. To do this, we take $\phi^\mu$ and replace it with $\phi^\mu + \delta \phi^\mu$, and then discard any terms that are higher order than $\mathcal{O}(\delta \phi^\mu)$ to find $\delta S$. The requirement that $\delta S = 0$ for arbitrary $\delta \phi^\mu$ will then allow us to find the Euler-Lagrange equations.
The variation of $\delta \phi^\mu$ yields $$ S + \delta S = \int_\mathcal{M} \text{d}^n x \sqrt{|g|} \mathscr{L}(\phi^\mu(x)+ \delta \phi^\mu,\nabla_\alpha \phi^\mu(x) + \nabla_\alpha \delta \phi^\mu) $$ and so the first-order variation $\delta S$ can be seen to be $$ \delta S = \int_\mathcal{M} \text{d}^n x \sqrt{|g|} \left[ \frac{\partial \mathscr{L}}{\partial \phi^\mu} \delta \phi^\mu + \frac{\partial \mathscr{L}}{\partial (\nabla_\alpha \phi^\mu)} \nabla_\alpha \delta \phi^\mu \right] $$
We now need to integrate by parts, as we would in flat spacetime. In the context of a curved background metric, Gauss's theorem is $$ \int_\mathcal{M} \text{d}^n x \sqrt{|g|} \nabla_\alpha A^\alpha = \oint_{\partial \mathcal{M}} \text{d}^{n-1} x \sqrt{|h|} n_\alpha A^\alpha $$ where $h_{\mu \nu}$ is the induced metric on the boundary $\partial \mathcal{M}$ and $n_\alpha$ is the unit normal on $\partial \mathcal{M}$.1 (See Appendix B of Wald's General Relativity or Chapter 3 of Poisson's A Relativist's Toolkit for a derivation.) In the present case, this means that we have $$ \delta S = \int_\mathcal{M} \text{d}^n x \sqrt{|g|} \left[ \frac{\partial \mathscr{L}}{\partial \phi^\mu} - \nabla_\alpha \frac{\partial \mathscr{L}}{\partial (\nabla_\alpha \phi^\mu)} \right] \delta \phi^\mu + \oint_{\partial \mathcal{M}} \text{d}^{n-1} x \sqrt{|h|} n_\alpha \frac{\partial \mathscr{L}}{\partial (\nabla_\alpha \phi^\mu)} \delta \phi^\mu. $$ If we are considering field variations $\delta \phi^\mu$ that vanish on the boundary but are otherwise arbitrary,2 then the boundary term vanishes and the quantity in brackets in the bulk integrand must vanish, leaving us with the result $$ \frac{\partial \mathscr{L}}{\partial \phi^\mu} - \nabla_\alpha \frac{\partial \mathscr{L}}{\partial (\nabla_\alpha \phi^\mu)} = 0 $$ as expected.
1 This implicitly assumes that $\partial \mathcal{M}$ is not a null hypersurface. See Wald and/or Poisson for details on how such cases can be treated.
2 Alternately, if one wishes to consider a wider class of variations, one can instead modify the action by adding an explicit surface term which cancels this term. This turns out to sometimes be helpful if the Lagrangian contains higher-order derivatives. The best-known example of this is the Gibbons-Hawking-York boundary term, an addition to the Einstein-Hilbert action that has various nice properties that the "plain-vanilla" Einstein-Hilbert action does not.