In any theory which includes General Relativity, there is no locally conserved energy. The reason is that energy creates a gravitational field which itself carries energy, so "gravity gravitates". There is a local quantity (the energy-momentum tensor) which is covariantly conserved, and there are global quantities (like the ADM mass) which express the total energy of the system and are conserved. But there are no currents which give locally conserved quantities.
(A more technical way to express this is that the spacetime transformations corresponding to local energy and momentum conservation are no longer global symmetries but gauge redundancies.)
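To make the first point slightly more concrete (this is just the standard statement, not specific to any particular theory of gravity): the energy-momentum tensor obeys a covariant conservation law,

$$ \nabla_\mu T^{\mu\nu} = \partial_\mu T^{\mu\nu} + \Gamma^\mu_{\mu\lambda} T^{\lambda\nu} + \Gamma^\nu_{\mu\lambda} T^{\mu\lambda} = 0 , $$

which is not of the form $\partial_\mu J^\mu = 0$ because of the connection terms, so it does not integrate to a conserved charge. Only when the metric admits a Killing vector $\xi_\nu$ does $J^\mu = T^{\mu\nu}\xi_\nu$ give an ordinary conserved current.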
In field theory (classical or quantum) without gravity, spacetime translations can be a global symmetry (if spacetime is flat), and correspondingly energy is locally conserved. Once you couple the theory weakly to gravity (even if gravity is only a background with no dynamics of its own), energy is only approximately conserved. When gravitational effects are large, there is no approximately conserved quantity corresponding to energy.
As Lubos said in the comments, nothing really new happens with respect to this question in string theory. In the most general situation there is no locally conserved quantity, and when the theory reduces to QFT on a spacetime which is approximately flat and gravitational effects are small, there is an approximately conserved quantity. String theory is compatible with both QFT and GR, which means it reproduces their results approximately in the appropriate limits. But of course away from those limits it has its own features, which are generally different from either of those theories. For this specific question it is much closer in spirit to GR.
(I answered part of a related question you asked elsewhere. Here is the rest)
Firstly, I unfortunately disagree with many of your comments above and the conclusions you draw from them.
I find your classification of the three types confusing. I propose to relabel the categories as follows for the sake of my comments below:
Category A. Perturbatively renormalisable: All divergences can be absorbed by renormalising a finite number of parameters of the theory while still maintaining all the desired symmetries.
Category B. Power-counting renormalisable: The superficial degree of divergence of graphs is investigated to see whether the divergence structure is likely to be manageable. If the couplings all have non-negative mass dimension then there is hope that the theory is manageable and that it might also fall under Category A (but that requires detailed proof); a minimal power-counting formula is sketched after this list.
Category C. Wilsonian renormalisation: Here one uses a cutoff and studies the RG flow of the theory. Effective low-energy theories are of interest in condensed matter and also in high-energy physics.
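As a minimal illustration of the power-counting criterion in Category B (written here for a single scalar field in four dimensions; other fields just add their own contributions): a diagram with $E$ external legs and $V_n$ vertices whose couplings $g_n$ have mass dimension $[g_n]$ has superficial degree of divergence

$$ D = 4 - E - \sum_n V_n\,[g_n] . $$

If every $[g_n] \ge 0$, then $D \le 4 - E$, so only a finite set of amplitudes is superficially divergent, which is what makes the divergence structure "likely to be manageable".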
Now some comments which address your questions, and also comments on some of your comments that I disagree with:
a. QED has a Landau pole in perturbation theory. No one knows what happens to it non-perturbatively. Indeed, long before you reach the Landau pole, the approximation you used to derive it breaks down.
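To spell out where that pole comes from (keeping only the one-loop running with a single charged lepton, which is precisely the approximation that produces it):

$$ \frac{1}{\alpha(\mu)} = \frac{1}{\alpha(m_e)} - \frac{2}{3\pi}\ln\frac{\mu}{m_e} \quad\Rightarrow\quad \mu_{\text{Landau}} \sim m_e\, e^{3\pi/(2\alpha)} , $$

an absurdly high scale, far beyond where the one-loop (or indeed any perturbative) treatment can be trusted.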
b. The Standard Model of particle physics, which is still the best experimentally tested theory of the strong, weak and electromagnetic interactions, falls under categories A and B. It can also be studied under category C if you wish, and people have done that to get low energy effective theories for particular applications (see below).
c. I do not know what "nonperturbatively renormalisable" could mean. Any expansion of the quantum theory will be perturbative in some parameter or another: if not the coupling $g$, then $1/N$, etc. (Unless perhaps you put it on a lattice and study it there... I am not sure even then.)
d. The physical meaning of Category A and why it's a big deal:
If a theory is in Category A then it is in some sense self-contained. After fixing a finite number of parameters you can answer many more questions to arbitrary accuracy. E.g. the anomalous magnetic moment of the electron agrees with experiment to about 10 significant figures, from many, many loops calculated by Kinoshita and company over decades.
It has been reported as the best verified prediction in the history of physics. So we must be doing something right here. I think we need to pause a few seconds to appreciate this achievement of mankind.
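(The leading term of that series is worth quoting, since it is a standard fact: Schwinger's one-loop result is

$$ a_e \equiv \frac{g-2}{2} = \frac{\alpha}{2\pi} + O(\alpha^2) \approx 0.00116 , $$

and the 10-figure agreement quoted above comes from adding the higher-loop terms, which is the Kinoshita-era industry referred to here.)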
To continue. Historically, the Category A requirement was one of the crucial guiding principles in constructing the Standard Model (see the Nobel acceptance speech by Weinberg). The theory so constructed predicted many new particles that had not yet been observed, and many Nobel prizes were awarded for predictions that came true.
It is always possible to cook up some theory to explain known facts, but to predict something and then see it come true, time and again, is exceptional.
So, a self-contained predictive structure that agrees with Nature. That is what category A is about. That is why it is in the text books.
Physically, it means that Category A theories are not sensitive to the unknown physics at the arbitrarily large cutoff you have chosen (eventually taken to infinity), ignorance of which is partially absorbed into the renormalised parameters. In this sense Category A theories are not physically different from Category C investigations. It's just that in Category A people tried to push the cutoff to infinity, and they succeeded.
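A rough way to quantify that insensitivity (this is the standard effective-field-theory estimate, not anything specific to the Standard Model): physics at the cutoff scale $\Lambda$ feeds into low-energy observables at energy $E$ only through operators of dimension $d > 4$, with relative effects of order

$$ \delta\mathcal{O} \sim \left(\frac{E}{\Lambda}\right)^{d-4} , $$

so pushing $\Lambda \to \infty$ removes them entirely, and keeping $\Lambda$ finite but very large leaves them as small corrections.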
But the enterprise could have failed badly; then it would have been forgotten and we would have been trying something else. (By the way, people were trying other things in the old days, like the S-matrix approach and the bootstrap, when they were groping in the dark... Google it.)
e. The fact that it worked tells us that any new physics beyond the Standard Model is at a much higher energy than what we had explored previously. The LHC is trying now to find the potential new physics. Which leads me to the next point.
f. The Standard Model is probably not the end of the story. In fact, with the modern perspective that all theories, even the Standard Model, are in some sense effective theories, you can now add non-renormalisable terms to account for potential high-energy physics that we have not yet seen. The extra terms you add are constrained by the success of what we have already seen. Weinberg's book and many other places discuss this.
So far no new physics beyond that predicted by the Standard Model has been seen at the LHC (though some people think that the smallness of the likely neutrino masses might be a hint of something beyond the horizon).
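A standard example of such a non-renormalisable addition, and the usual way the neutrino-mass hint is phrased, is the dimension-five operator built from the lepton and Higgs doublets,

$$ \mathcal{L}_5 \sim \frac{(LH)(LH)}{\Lambda} \;\Rightarrow\; m_\nu \sim \frac{v^2}{\Lambda} , $$

so neutrino masses of order $0.1$ eV with $v \approx 246$ GeV point at a scale $\Lambda$ of very roughly $10^{14}$–$10^{15}$ GeV. (The numbers here are order-of-magnitude only.)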
g. The Wilsonian perspective, that all theories are effective theories, is wonderful. After all, we can't claim to know what lies beyond what we have seen so far. His approach was immensely successful in condensed matter physics, and as I mentioned in the points above, it is also adopted by many particle physicists.
But instead of "integrating out higher degrees of freedom", which is difficult both technically (the top-down approach) and in practice (what is your top theory? string theory? something else?), most people start at the bottom (a renormalisable theory) and add non-renormalisable terms, as I mentioned above.
In summary:
Category A is the crowning glory of particle physicists. It still provides guidance on the construction of extensions of particle physics theories.
Category C is the modern perspective on what theories are. But as I said above, it doesn't conflict with Category A, which was an ambitious programme that somehow succeeded.
There are some language differences between those in the Category A camp and those in the Category C camp, but I believe it's simply a matter of history and convenience.
I recommend the book by Zinn-Justin, which I believe covers all the categories: A, B and C. It's a thousand-odd pages, with all details worked out, though the presentation is a bit terse. (I have not read it all, only flipped through it many years ago.) The author is a renowned practitioner in the field of renormalisation with many original contributions.
The problem is that the perturbation series, even in the best-behaved theories, is not a sufficient criterion for reconstructing the theory. In the case of QCD, you can reconstruct the non-perturbative theory by defining a path integral on a lattice and taking the limit of a fine lattice with the coupling going logarithmically to zero as the lattice spacing shrinks; this gives a consistent continuum limit which defines the non-perturbative path integral. This definition is computational and absolute: it gives you an algorithm to compute all correlation functions in the theory.
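The logarithmic statement can be made explicit with the leading-order asymptotic-freedom relation between the lattice spacing $a$ and the bare coupling $g_0$ (a one-loop sketch; the two-loop prefactor is omitted):

$$ a\,\Lambda_{\text{lat}} \sim e^{-1/(2 b_0 g_0^2)} , \qquad b_0 = \frac{11 - \tfrac{2}{3} n_f}{16\pi^2} , $$

so sending $a \to 0$ at fixed physical scale $\Lambda_{\text{lat}}$ forces $g_0^2 \to 0$ like $1/\ln(1/a)$, which is the sense in which the continuum limit is well defined for QCD.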
For quantum gravity, you can start with a flat metric and do a perturbation series, and get the graviton interactions. But there is no reason to believe that there is a non-perturbative theory you are approximating when you do this. The path integral for quantum gravity is not lattice-regularized very well, because the lattice spacing is dynamical: you have a metric that tells you what the actual distance between lattice points is. When you take the limit of small lattice distance, there is no guarantee that you have a well-defined quantity.
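For orientation, the textbook power-counting statement about that flat-space expansion: one writes

$$ g_{\mu\nu} = \eta_{\mu\nu} + \kappa\, h_{\mu\nu}, \qquad \kappa = \sqrt{32\pi G}, \qquad [\kappa] = -1 \ \text{(mass dimension)} , $$

so every graviton vertex carries a coupling of negative mass dimension and the superficial degree of divergence grows with the loop order. This is why the perturbation series by itself does not point to any particular non-perturbative completion.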
Further, the path integral might include sums over non-equivalent topologies. You could imagine a handle popping out of space-time and disappearing later. If this is so, and if the sum runs over arbitrarily small space-time structure, then there is a serious problem, because high-dimensional topologies are known to be non-classifiable, so it is impossible to give an algorithm which sums over each topology once and only once. You can give an algorithm on simplices which sums over all topologies in a redundant way, by summing over all possible gluings of the simplices. But if you think the continuum object is well defined, then the simplex sum should reproduce the sum over topologies, which is a non-computable thing. This suggested to Penrose that the full theory of quantum gravity is capable of hyper-computation (stronger than Turing computation), but I am personally sure that hyper-computation of this type is a logically incoherent concept, since the logical properties of hypercomputation cannot be described in any finite way using axioms, even allowing the axiom system to increase in complexity with time.
Even if you just look at the perturbation series and try to make sense of it, there is a serious problem when the scattering of particles is Planckian or above. If you collide two particles at Planck energies or more, you should produce an intermediate black-hole state, and the sum over intermediate states should then run over the degrees of freedom of this intermediate black hole. But a black hole of radius R only has R^2 worth of degrees of freedom, while a field theory assigns that region R^3 worth. So the perturbation theory's scaling law for the maximum amount of information in a region large enough to contain a black hole is not consistent with gravitational holography.
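In formulas (the standard holographic counting, quoted here only at the level of scaling): the Bekenstein–Hawking entropy of a black hole of radius $R$ versus the entropy available to a local field theory with ultraviolet cutoff $\Lambda$ in the same region is

$$ S_{\text{BH}} \sim \frac{A}{4G} \sim \frac{R^2}{\ell_P^2} , \qquad S_{\text{QFT}} \sim (\Lambda R)^3 , $$

and for a cutoff near the Planck scale the field-theory count exceeds the black-hole count by a factor of order $R/\ell_P$, which is the mismatch referred to above.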
Transitioning to an S-matrix picture resolves all these problems, because it gives string theory. In string theory, the perturbation series is over S-matrix particle states, not field states, so the intermediate states are not localized at individual space-time points. The sum over intermediate states reproduces the fluctuations of an extended object, whose degree-of-freedom count is holographically consistent. The algebra of external operators is given by insertions on the string world sheet (or on a brane world-volume theory), and the number of degrees of freedom in the classical limit of large branes or black holes has the correct holographic scaling. This is not a surprise for gravity, but it is not possible with a naive field theory, because the field theory has many more degrees of freedom at short distances.
't Hooft's argument
The essence of the very wordy argument above can be explained in a short calculation by 't Hooft. He asked: given a Schwarzschild horizon, what is the entropy you can store in the fields just outside the horizon? You fix the total energy, assume the black hole is enormous, and ask how many different microstates fit in the region $r > 2M$.
The answer is easily seen to be divergent. Near the horizon, the redshift factor, $\sqrt{1-2M/r} \approx \sqrt{(r-2M)/2M}$ in the near-horizon approximation, redshifts locally defined energies, so at a fixed energy at infinity you keep picking up modes of higher and higher local frequency. In flat space, the number of field modes of energy less than $E$ in a volume $V$ is determined by doing Fourier transforms in a box and scales as $V E^3$ (the corresponding vacuum-energy divergence, which weights each of the same modes by its energy, scales as $V E^4$); either way every mode is counted once and only once. Because the energies redshift, the integral over the region outside the horizon diverges, so the number of states of energy less than $E$ near a black-hole horizon is divergent in any field theory.
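A compressed version of the calculation (a sketch of 't Hooft's brick-wall estimate, with all numerical factors dropped): treat the exterior fields as thermal at the Hawking temperature $T_H$, so the local (Tolman) temperature is $T_H/\sqrt{f}$ with $f = 1 - 2M/r$, and the entropy is

$$ S \sim \int_{2M+\epsilon} T_{\text{loc}}^3 \, dV_{\text{proper}} \sim T_H^3 \int_{2M+\epsilon} \frac{r^2\,dr}{f^2} \;\propto\; \frac{A}{\epsilon_{\text{proper}}^2} , $$

which diverges as the cutoff approaches the horizon; the divergence is cured only by cutting the modes off a small proper distance (the "brick wall") outside $r = 2M$.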
The resolution of this paradox is to adopt an S-matrix picture for black holes and to renounce most of these degrees of freedom as unphysical. This means that the space-time around a black hole is only a reconstruction from the much smaller number of degrees of freedom of the black hole itself. This is the origin of the principle of holography, and the principle is correct in string perturbation theory.
Within loop quantum gravity, the regulator is completely different, and it might not be consistent; I am not sure, because I do not understand it well enough. The regulated theory should reproduce something S-matrix-like when it has an S-matrix, but such states are not known in loop gravity. The knot representation, however, builds in loops and cuts down the field-theoretic degrees of freedom in a way that is reminiscent of holography, so it isn't ruled out automatically.
But just doing a path integral over spacetime fields when the spacetime includes black holes is plainly impossible. Not because of renormalizability (you are right that this is not the issue: it could be fixed by an ultraviolet fixed point, asymptotic safety in Weinberg's terminology), but because the number of degrees of freedom in the exterior of a black hole is too large to be physical, leading to a divergent additive constant in the black-hole entropy, which is physically ridiculous. A quantum field theory of gravity would, if it were consistent, have to be a remnant theory, and that is physically preposterous.
I am sorry that the above sounds more hand-waving than it is; this is a limitation of my exposition style rather than of the content. The papers of 't Hooft are from the 1985–93 era in Nuclear Physics B, and the papers of Susskind on holography in string theory are well-known classics.