Theories like QED, which have only a finite number of relevant operators, are very rare. Many important predictions are made by means of effective theories (see the reviews by Aneesh V. Manohar, and by Scherer and Schindler). It is sometimes said that, since these theories are nonrenormalizable, any loop correction beyond tree level is useless, because an infinite number of free parameters can then fit the theory to any data we have.
However, consider for example chiral perturbation theory, which describes the low-energy degrees of freedom of QCD (with 3 flavors):
$$ \mathcal{L} = \frac{f_{\pi}^2}{4}\,\mathrm{Tr}\left(\partial_{\mu}U^{\dagger}\partial^{\mu}U\right) + \dots$$
($U \in SU(3)$ is generated by the meson fields.) Here, one-loop corrections give rise to a finite set of new counterterms whose coefficients can be estimated from various processes. There is evidence that the precision improves with respect to the tree-level computation.
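To make this concrete (using the standard exponential parametrization, which is an assumption about the conventions intended here), the meson fields enter through

$$ U(x) = \exp\left(\frac{i}{f_{\pi}}\,\lambda^a \pi^a(x)\right), $$

where the $\lambda^a$ are the Gell-Mann matrices. Expanding the Lagrangian above in powers of $\pi^a$ reproduces the canonically normalized kinetic terms $\frac{1}{2}\partial_{\mu}\pi^a\partial^{\mu}\pi^a$ plus an infinite tower of interaction vertices suppressed by powers of $1/f_{\pi}$.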
The improved predictions can be explained as follows: even though the theory is not renormalizable in the usual sense, if we restrict the Lagrangian to terms with less than a given number of derivatives, and to a given number of loops, then there is a finite number of counterterms. The example mentioned above corresponds to terms with up to 4 derivatives. These same order-4 terms also serve as the counterterms needed to renormalize the one-loop contribution of the terms with up to 2 derivatives. Thus, if we limit ourselves to low-energy processes, we need only a finite number of counterterms. In other words, up to a given energy scale we have control over the counterterms.
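The bookkeeping behind this statement is Weinberg's power-counting formula (a standard result, stated here as a sketch): a diagram with $L$ loops and $V_d$ vertices carrying $d$ derivatives scales with the external momenta as $p^D$, where

$$ D = 2 + 2L + \sum_d V_d\,(d-2). $$

At tree level ($L=0$) with only two-derivative vertices this gives $D=2$, while one loop ($L=1$) built entirely from the two-derivative vertices gives $D=4$, so the divergences are absorbed precisely by the four-derivative counterterms mentioned above.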
In the example of chiral perturbation theory, we know that it is a low-energy effective theory, so we know that we should truncate at some order in the number of derivatives (or momenta).
This procedure is known as approximate renormalizability, in contrast to the "exact" renormalizability present in QED, where any energy scale can be reached. Actually, this exact renormalizability is not of much practical use, since QED itself is not valid at very high energies (other interactions become important).
Thus, given that we want to work only up to some energy scale, we can treat the effective field theory as a renormalizable theory and perform loop expansions, which generate only counterterms that involve no higher scale.
The question is how we can know where to stop. The answer lies in our knowledge of the degrees of freedom outside the theory. For example, the Fermi theory of weak interactions (which includes an effective four-fermion term) gives good predictions for beta decays up to energies of the order of the mass of the $W$-boson, which is integrated out in the Fermi theory.
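As a sketch of that matching (using the standard tree-level relation): at momentum transfer $q^2 \ll M_W^2$ the $W$ propagator collapses, $\frac{g^2}{q^2 - M_W^2} \to -\frac{g^2}{M_W^2}$, which fixes the coefficient of the four-fermion operator,

$$ \frac{G_F}{\sqrt{2}} = \frac{g^2}{8 M_W^2}. $$

The Fermi theory is therefore a good description only for $E \ll M_W$; near $E \sim M_W$ the full propagator structure of the underlying theory must be restored.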
This "relaxation" of the renormalizability requirements does not mean
that we have an infinite number of effective theories at our disposal.
Extra structures are needed to create the property of approximate renormalizability. For example, chiral perturbation theory stems from the origin of pions as the Goldstone bosons of the chiral symmetry breaking.
A major factor that can spoil approximate renormalizability is anomalies. If we try to gauge an anomalous symmetry, we lose control over the number of derivatives in the counterterms. If, on the other hand, we gauge only anomaly-free subgroups, then the counterterms will be gauge invariant up to total derivatives in the Lagrangian, and we have Ward identities at each scale.
Like you said, we can include gravity perturbatively in the framework of low-energy effective QFT, as reviewed in reference 1. This works because gravity is extremely weak at the energies that characterize modern particle-physics experiments. But the interest in quantum gravity revolves around nonperturbative/high-energy/strong-field issues, like the holographic principle and the information-loss paradox, both of which were already known in the 1970s (references 2, 3, 4) and were surely on Distler's mind in 1982.
Thanks to universality, very different theories can become indistinguishable from each other at sufficiently low resolution. Low-energy experiments can only fix the first several terms in the Lagrangian on which perturbation theory is based. That's what allows us to include gravity in the Standard Model in the sense of low-energy effective theory (reference 1), and I'm guessing this was also the basis for Georgi's assertion. Terms of higher order in the cutoff are not resolved, so we cannot attack the interesting questions about quantum gravity, which are nonperturbative/high-energy/strong-field, by extrapolating upward from the low-energy effective theory.
Even if it was fair at the time, Georgi's "waste of time" judgement is obsolete now, because now we have approaches to studying quantum gravity that don't rely on extrapolating upward from a low-energy effective theory. Perturbative string theory is tightly constrained by numerous anomaly cancellation requirements, which are nonperturbative. Fully nonperturbative formulations like AdS/CFT are also available. (See references 5 and 6 for perspectives about the situation in the more realistic case of asymptotically de Sitter spacetime, which is not understood as well yet.) In hindsight, Georgi/Distler's statement
...there’s no decoupling regime in which quantum “pure gravity” effects are important, while other particle interactions can be neglected
seems to be true in an even stronger sense in string theory. Here's an excerpt from section 2.2 in reference 7:
Typically, the mass scale associated to [quantum gravity] physics is [the Planck mass] $M_p$, and one might expect that working at energy scales far below the Planck mass would mean that we lose sensitivity to such physics. But the conjecture says that if in the bulk of moduli space... the tower of states has a mass scale around the Planck mass $M_p$ ..., then at large field expectation values this mass scale is exponentially lower than $M_p$. Therefore, it claims that the naive application of decoupling in effective quantum field theory breaks down at an exponentially lower energy scale than expected whenever a field develops a large expectation value.
Whether this "stringy" phenomenon is our enemy or our friend, it at least corroborates the idea that the interesting questions about quantum gravity are not things we can study properly by decoupling it from everything else.
Donoghue (1995), Introduction to the Effective Field Theory Description of Gravity (https://arxiv.org/abs/gr-qc/9512024)
Bekenstein (1973), Black holes and entropy, Physical Review D 7, 2333-2346
Hawking (1975), Particle creation by black holes (https://projecteuclid.org/euclid.cmp/1103899181)
Hawking (1976), Breakdown of predictability in gravitational collapse, Phys. Rev. D 14, 2460–2473
Witten (2001), Quantum Gravity In De Sitter Space (https://arxiv.org/abs/hep-th/0106109)
Banks (2010), Supersymmetry Breaking and the Cosmological Constant (https://arxiv.org/abs/1402.0828)
Palti (2019), The Swampland: Introduction and Review (https://arxiv.org/abs/1903.06239)
Best Answer
You suggest that we can use a nonrenormalizable (NR) theory at energies greater than the cutoff, by measuring sufficiently many coefficients at any energy.
However, a general expansion of an amplitude for an NR theory that breaks down at a scale $M$ reads $$ A(E) = A^0(E) \sum_n c_n \left(\frac{E}{M}\right)^n $$ where I assumed that the amplitude is characterized by a single energy scale $E$. Thus at any energy $E\ge M$, we cannot calculate amplitudes from a finite subset of the unknown coefficients.
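To see the breakdown quantitatively: truncating the series at order $N$ leaves a relative error of order

$$ \frac{\Delta A}{A} \sim c_{N+1}\left(\frac{E}{M}\right)^{N+1}, $$

which is systematically suppressed for $E \ll M$ but is not suppressed at all once $E \gtrsim M$, no matter how many coefficients we measure.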
On the other hand, we could have an infinite stack of NR effective theories (EFTs). The new fields introduced in each EFT could successively raise the cutoff. In practice, however, this is nothing other than discovering new physics at higher energies and describing it with QFT. That's what we've been doing at colliders for decades.