In general relativity, Einstein's equation implies that the stress-energy tensor on its right-hand side is conserved (has vanishing divergence), due to the Bianchi identity. Considering the variational principles that lead to Einstein's equation leads to the conclusion that this stress tensor is given by the variational derivative of the matter action with respect to the metric tensor. However, on several occasions I have heard people state that, quite generally, one can define the stress tensor of a field theory in this way and that it is automatically conserved, even in flat spacetime and without any coupling to gravity! I wonder whether this is true; I don't see a reason why it should be.
General Relativity – Conservation of Stress-Energy Tensor as Variation of Action with Respect to Metric
conservation-laws, field-theory, general-relativity, lagrangian-formalism, stress-energy-momentum-tensor
Related Solutions
I do think Jerry Schirmer answered the question in the comments, but I'll try to expand just to make clear how he explained everything.
Let us take as given that special relativity is correctly described by physics in Minkowski spacetime. Then we can ask ourselves how to include gravity without violating causality, as required by the finite speed of light.
The idea is to consider Einstein's elevator: no local experiment can distinguish bodies in free fall in a uniform gravitational field from the same bodies viewed from a uniformly accelerated frame. That's because gravity affects everything in the same way. A somewhat more formal version of this is called Einstein's equivalence principle (in contrast with Galileo's principle, which concerns coordinate transformations between frames moving at constant relative velocity).
Note first that this is not the case for electromagnetism. One can always use test charges to determine the electromagnetic fields, and it is impossible to do away with them using accelerated frames. Also, the equivalence principle is strictly local. If you look at extended regions, gravity will appear through tidal forces.
So, if you think of special relativity as a particular case of general relativity (because it is just the same theory without gravity), the question is: what looks locally like special relativity but not globally? The answer is curved Lorentzian manifolds, which are locally Minkowski.
But, as Jerry stressed, thinking of curved manifolds as a generalization of flat ones does not, in principle, say anything about gravity. Only by noticing that gravity is a force unlike any other, and formalizing this through the equivalence principle, can one justify the physics behind it, that is, the use of curved manifolds. For instance, you suggest it is natural to generalize the situation by allowing curved spaces, but from the mathematical point of view one could just as well argue that there are other forms of generalization, e.g. we could instead try to projectivize Minkowski space. This is indeed useful in other contexts, but it has nothing to do with gravity. So for a physicist it is important to have "conceptual insights" to guide the process of "generalization for comprehension"; in other words, we need principles with physical content.
I'm really unsure about what Gauss could have been thinking regarding the metric. He did try to formulate classical mechanics in a differential-geometric way (Lanczos, "The Variational Principles of Mechanics", discusses it), but if that's what you're referring to, then it had nothing to do specifically with gravity.
EDIT: Oh boy, that last sentence is very misleading, I'm sorry. I had a look at Lanczos' book and realized that, while Gauss pushed for a different formulation of classical mechanics (it is called the Principle of Least Constraint, page 106 in Lanczos), it was only some time later that Hertz gave the principle its geometrical interpretation. So it is really not relevant to your question. I won't erase the paragraph though, in case anyone is interested.
Also, the equivalence principle argument says nothing about the field equations, and would be true even if the correct equations were different. As a matter of fact, a lot of general relativity is independent of the Einstein field equations, like the causal structure and (to some extent) the singularity theorems. This is why the equivalence principle was formulated as early as 1907 but the field equations came only in 1915.
I'm not a big fan of "what if" questions in history, mainly because they don't seem to have answers, but while Poincaré had the Lorentz transformations and a lot of understanding of special relativity, I have never heard of anyone who anticipated the equivalence principle. So I hope this makes it plausible that, while others could have arrived at SR, GR was not likely to come, because first one needed to understand what gravity is. Nordström's theory is an extension of ideas from electromagnetism and was bound to fail. Hilbert indeed got the field equations right on his own, but would not have got there without the motivation of curved spacetimes.
Best Answer
Actually, the metric variational definition of the stress-energy tensor (due to Hilbert, as remarked by Qmechanic) is a universal improvement procedure for the canonical stress-energy tensor (and hence does not always coincide with the latter), in a sense which will be made precise below. Such a procedure is necessary because the canonical stress-energy tensor, although always conserved, often fails to satisfy other physical requirements like gauge invariance (since it is an observable quantity), symmetry (needed if we want it to be a source for the gravitational field) and tracelessness (for locally scale invariant theories). For example, all three requirements fail for pure electrodynamics in four space-time dimensions.
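To make the electrodynamics example concrete, here is a sketch in flat space-time. The conventions are my own assumption and are not fixed by the text above: signature $(+,-,-,-)$, Heaviside-Lorentz units, $F_{\mu\nu}=\partial_\mu A_\nu-\partial_\nu A_\mu$ and $L=-\frac{1}{4}F_{\alpha\beta}F^{\alpha\beta}$. With these, the canonical tensor reads $$T_{\rm can}^{\mu\nu}=\frac{\partial L}{\partial(\partial_\mu A_\lambda)}\partial^\nu A_\lambda-\eta^{\mu\nu}L=-F^{\mu\lambda}\partial^\nu A_\lambda+\frac{1}{4}\eta^{\mu\nu}F_{\alpha\beta}F^{\alpha\beta}\ .$$ It contains the potential through $\partial^\nu A_\lambda$ and so is not gauge invariant, it is not symmetric, and its trace is $\eta_{\mu\nu}T_{\rm can}^{\mu\nu}=\frac{1}{2}F_{\alpha\beta}F^{\alpha\beta}\neq 0$. The familiar symmetric tensor $$T^{\mu\nu}=F^{\mu\lambda}F_\lambda{}^{\nu}+\frac{1}{4}\eta^{\mu\nu}F_{\alpha\beta}F^{\alpha\beta}$$ is gauge invariant, symmetric and traceless in four dimensions (its trace is $-F_{\alpha\beta}F^{\alpha\beta}+F_{\alpha\beta}F^{\alpha\beta}=0$). The two differ by $$T^{\mu\nu}-T_{\rm can}^{\mu\nu}=F^{\mu\lambda}\partial_\lambda A^\nu=\partial_\lambda\left(F^{\mu\lambda}A^\nu\right)\ ,$$ where the last step uses the vacuum field equation $\partial_\lambda F^{\mu\lambda}=0$. Because $F^{\mu\lambda}$ is antisymmetric, $\partial_\mu\partial_\lambda(F^{\mu\lambda}A^\nu)$ vanishes identically, so both tensors are conserved on shell.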
Even if you are dealing with a field theory in Minkowski space-time, it is inevitably coupled to gravity simply because of the fact that the Lagrangian depends on the space-time metric (here taking the particular value of the Minkowski metric). The particular dynamics of the metric is irrelevant - all we need is that there are no other "external" fields besides the metric and that the field action functional is diffeomorphism invariant.
Let $L=L(\phi,g)$ be a local field Lagrangian in the space-time $(M,g)$, and $$S_K[\phi,g]=\int_K L(\phi,g)\sqrt{|\det g|}\mathrm{d}x\ ,\quad K\subset M\text{ any bounded region}$$ the corresponding family of action functionals, indexed by $K$. We allow $L$ to depend on $\phi$, $g$ and their derivatives up to some finite but otherwise arbitrary order, and to have no explicit space-time dependence, since we want it not to depend on any other fields. The infinitesimal variation of $S_K$ with respect to a vector field $X$ on $M$ (i.e. an infinitesimal diffeomorphism) is then given by $$\delta_X S_K[\phi,g]=\int_K\left(\frac{\delta L(\phi,g)}{\delta g_{\mu\nu}}\delta_X g_{\mu\nu}+\frac{\delta L(\phi,g)}{\delta \phi^j}\delta_X \phi^j+\nabla_\mu(T^{\mu\nu}X_\nu)\right)\sqrt{|\det g|}\mathrm{d}x\ ,\quad X_\rho=g_{\rho\sigma}X^\sigma\ ,$$ where $\frac{\delta L(\phi,g)}{\delta g_{\mu\nu}}$ and $\frac{\delta L(\phi,g)}{\delta \phi^j}$ are respectively the Euler-Lagrange (i.e. variational) derivatives of $L(\phi,g)$ with respect to $g$ and $\phi$, $\nabla$ is the Levi-Civita covariant derivative associated to $g$, $T^{\mu\nu}$ is the (canonical or improved) stress-energy tensor, $$\delta_X g_{\mu\nu}=\nabla_\mu X_\nu+\nabla_\nu X_\mu$$ is the Lie derivative of $g$ along $X$, and the infinitesimal field variation $\delta_X\phi^j$ depends on the particular way we lift $X$ to a projectable vector field on the total space of the fiber bundle over $M$ where the fields $\phi^j$ live (for instance, if they are all scalar fields, we simply have $\delta_X\phi^j=-X\phi^j=-X^\mu\nabla_\mu\phi^j$).
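As a quick check of the expression for $\delta_X g_{\mu\nu}$, the coordinate formula for the Lie derivative of the metric can be rewritten with covariant derivatives. The sketch below uses only that the Levi-Civita connection is torsion-free (so the Christoffel terms cancel pairwise when $\partial$ is replaced by $\nabla$) and metric-compatible (so $\nabla_\rho g_{\mu\nu}=0$): $$(\mathcal{L}_X g)_{\mu\nu}=X^\rho\partial_\rho g_{\mu\nu}+g_{\rho\nu}\partial_\mu X^\rho+g_{\mu\rho}\partial_\nu X^\rho=X^\rho\nabla_\rho g_{\mu\nu}+g_{\rho\nu}\nabla_\mu X^\rho+g_{\mu\rho}\nabla_\nu X^\rho=\nabla_\mu X_\nu+\nabla_\nu X_\mu\ .$$ In Minkowski space-time with inertial coordinates this reduces to $\partial_\mu X_\nu+\partial_\nu X_\mu$, which vanishes precisely for the Killing fields (translations and Lorentz transformations) and is nonzero for a generic $X$.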
There is an implicit but crucial requirement on the admissible improvements for $T^{\mu\nu}$ - namely, the improved Noether current $j^\mu(L,X)=T^{\mu\nu}X_\nu$ associated with the would-be symmetry $X$ of the action functional should not only be linear in $X$ but depend only on the point values of $X$ (we call this property ultralocality) - therefore, we wrote it already as a tensor contraction. This requirement also affects to a certain extent the definition of $\delta_X\phi^j$, but the details of this are not important in what follows. Why do we insist on this requirement? As we shall see below, ultralocality singles out a unique improvement prescription for $T^{\mu\nu}$ which in addition satisfies all physical desiderata. This idea applies more generally to any local symmetry - for instance, it may be used to improve the canonical Noether current associated with local gauge symmetries.
Diffeomorphism invariance of the action functional means we require that $\delta_X S_K[\phi,g]=0$ for all $X,\phi,g,K$. If, in addition, the fields $\phi^j$ satisfy the Euler-Lagrange equations of motion, we have that $$2\frac{\delta L(\phi,g)}{\delta g_{\mu\nu}}\nabla_\mu X_\nu+\nabla_\mu(T^{\mu\nu}X_\nu)=\left(2\frac{\delta L(\phi,g)}{\delta g_{\mu\nu}}+T^{\mu\nu}\right)\nabla_\mu X_\nu+X_\nu\nabla_\mu T^{\mu\nu}=0\ .$$ The first identity seems trivial but in fact follows from ultralocality of the improved Noether current, as explained above. Since $X$ is arbitrary, we may specify $X_\nu$ and $\nabla_\mu X_\nu$ independently at each point of $M$, and so we obtain in a single stroke:
- The desired variational formula for the improved stress-energy tensor $$T^{\mu\nu}=-2\frac{\delta L(\phi,g)}{\delta g_{\mu\nu}}$$ and therefore the symmetry $T^{\mu\nu}=T^{\nu\mu}$;
- The covariant conservation law $\nabla_\mu T^{\mu\nu}=0$.
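Spelled out, the step leading to these two results can be sketched as follows, assuming (as the argument does implicitly) that $\frac{\delta L}{\delta g_{\mu\nu}}$ is symmetric in $\mu\nu$. On shell the $\frac{\delta L}{\delta\phi^j}$ term drops out, and since $K$ is an arbitrary bounded region the integrand itself must vanish: $$\frac{\delta L}{\delta g_{\mu\nu}}\left(\nabla_\mu X_\nu+\nabla_\nu X_\mu\right)+\nabla_\mu\left(T^{\mu\nu}X_\nu\right)=0\ .$$ Using the symmetry of $\frac{\delta L}{\delta g_{\mu\nu}}$ in the first term and the Leibniz rule in the second gives $$\left(2\frac{\delta L}{\delta g_{\mu\nu}}+T^{\mu\nu}\right)\nabla_\mu X_\nu+X_\nu\nabla_\mu T^{\mu\nu}=0\ .$$ At a given point $p$, choosing $X$ with $X_\nu(p)=0$ and $\nabla_\mu X_\nu(p)$ arbitrary forces the bracket to vanish; choosing $X_\nu(p)$ arbitrary then forces $\nabla_\mu T^{\mu\nu}=0$. In Minkowski space-time with inertial coordinates the Christoffel symbols vanish, so the two results become $$T^{\mu\nu}=-2\left.\frac{\delta L}{\delta g_{\mu\nu}}\right|_{g=\eta}\ ,\qquad \partial_\mu T^{\mu\nu}=0\ ,$$ which is the ordinary flat-space conservation law: the metric is varied only as a bookkeeping device and is set back to $\eta$ at the end.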
If the metric happens to obey a dynamics determined by a Lagrangian $L_G(g)$, then $T^{\mu\nu}$ automatically becomes the source to the metric equations of motion. This also guarantees compliance with the second Noether theorem, as it should - the canonical Noether current associated to the total (i.e. metric + field) Lagrangian and to $X$ still vanishes on shell if the total action functional is also diffeomorphism invariant.
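In formulas, and only as a sketch: suppose the total action is $\int_K\left(L_G(g)+L(\phi,g)\right)\sqrt{|\det g|}\mathrm{d}x$ (this additive split is an assumption matching the description above). Stationarity with respect to $g$ then gives $$\frac{\delta L_G}{\delta g_{\mu\nu}}+\frac{\delta L}{\delta g_{\mu\nu}}=0\quad\Longleftrightarrow\quad\frac{\delta L_G}{\delta g_{\mu\nu}}=\frac{1}{2}T^{\mu\nu}\ .$$ If the gravitational action is diffeomorphism invariant on its own, taking $X$ with compact support inside $K$ and integrating by parts yields the off-shell identity $$\nabla_\mu\frac{\delta L_G}{\delta g_{\mu\nu}}=0\ ,$$ so the metric equations of motion are consistent with $\nabla_\mu T^{\mu\nu}=0$. For the Einstein-Hilbert Lagrangian, $\frac{\delta L_G}{\delta g_{\mu\nu}}$ is proportional to the Einstein tensor and this identity is the contracted Bianchi identity mentioned in the question.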
Although it is not trivial to show, $T^{\mu\nu}$ also happens to be traceless if the field theory exhibits local scale invariance.
If the fields $\phi^j$ are all scalar and $L(\phi,g)$ is a Lagrangian of first order in $\phi$ with a Klein-Gordon-like kinetic part and not depending on derivatives of $g$, then $T^{\mu\nu}$ coincides with the canonical stress-energy tensor. This is no longer the case for spinor fields, whose Lagrangian usually also depends on the first derivatives of the metric through the spin connection, for scalar fields with non-minimal curvature coupling, or for the electromagnetic field.
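As an illustration of the scalar case, consider a single real scalar field with $L=\frac{1}{2}g^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi-V(\phi)$. The signature $(+,-,-,-)$ and the potential $V$ are my own choices for this sketch, and I read $\frac{\delta L}{\delta g_{\mu\nu}}$ as the variational derivative of $\sqrt{|\det g|}\,L$ divided by $\sqrt{|\det g|}$, which is how it enters the integrand of $\delta_X S_K$. Using $\delta\sqrt{|\det g|}=\frac{1}{2}\sqrt{|\det g|}\,g^{\mu\nu}\delta g_{\mu\nu}$ and $\delta g^{\alpha\beta}=-g^{\alpha\mu}g^{\beta\nu}\delta g_{\mu\nu}$, one finds $$\frac{\delta L}{\delta g_{\mu\nu}}=\frac{1}{2}g^{\mu\nu}L-\frac{1}{2}\nabla^\mu\phi\nabla^\nu\phi\quad\Longrightarrow\quad T^{\mu\nu}=-2\frac{\delta L}{\delta g_{\mu\nu}}=\nabla^\mu\phi\nabla^\nu\phi-g^{\mu\nu}L\ .$$ At $g=\eta$ this is exactly the canonical expression $\frac{\partial L}{\partial(\partial_\mu\phi)}\partial^\nu\phi-\eta^{\mu\nu}L$, with energy density $T^{00}=\frac{1}{2}\dot\phi^2+\frac{1}{2}|\nabla\phi|^2+V(\phi)$. Conservation can be checked directly: $$\partial_\mu T^{\mu\nu}=(\Box\phi)\,\partial^\nu\phi+\partial^\mu\phi\,\partial_\mu\partial^\nu\phi-\partial^\nu L=\left(\Box\phi+V'(\phi)\right)\partial^\nu\phi\ ,$$ which vanishes when the field equation $\Box\phi+V'(\phi)=0$ holds.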
The above understanding of the metric variational definition of the stress-energy tensor in full generality came surprisingly late - it was thoroughly developed by M. Forger and H. Römer ("Currents and the Energy-Momentum Tensor in Classical Field Theory: a Fresh Look at an Old Problem". Ann.Phys. 309 (2004) 306-389, arXiv:hep-th/0307199), whose work we warmly recommend for (many) more details and examples.