In this post we will give an example of the Riemann zeta function at negative arguments that naturally comes up in physics. As it has been mentioned in the comment section, the expression for the Casimir force in $d$ spatial dimensions is related to $\zeta(-d)$; more precisely, the force per unit area is given by
\begin{equation}
\frac{F_\mathrm{Casimir}}{L^{d-1}}=-\frac{C(d)}{a^{d+1}} \tag{1}
\end{equation}
where $L^{d-1}$ is the hyper-area of the plates, and $a$ is the separation between them. Here, the constant $C(d)$ is given by
$$
\begin{aligned}
C(d)&=-2^{-d}\pi^{d/2} \Gamma\left(1-\frac{d}{2}\right)\color{red}{\zeta(-d)}\\
&=\frac{2^{-d}}{1+d}\ \xi(-d)
\end{aligned}\tag{2}
$$
(Remark: in the case of fields with spin, one has to multiply the expression above by the number of degrees of freedom; $2s+1$ for massive particles with spin $s$, and $2$ for non-scalar massless particle).
We will next outline the calculation of $F(d)$ in the zeta-regularisation scheme, though, as usual, one may prove that the result is scheme-independent (the proof is left to the reader).
We consider two plates of length $L$ (and hence, hyper-area $L^{d-1}$), separated by a distance $a$, such that the enclosed volume is $L^{d-1}a$. The free-field energy inside the plates, as given by the quantisation of a free massless scalar field, is
$$
E_0=\frac12 L^{d-1} a\int\frac{\mathrm d\boldsymbol p}{(2\pi)^d}\ \omega_{\boldsymbol p}+\text{const.}\tag{3}
$$
where $\boldsymbol p\in\mathbb R^d$, and $\omega_{\boldsymbol p}\equiv+|\boldsymbol p|$. Moreover, the $\text{const.}$ term is a counter-term which will play no role in zeta-regularisation (even though it is important in other schemes), because it happens to vanish. We will omit it in what follows.
If we take $a\ll L$ to be a finite distance, then the momentum along this direction is quantised. For example, imposing periodic boundary conditions,
$$
\boldsymbol p=(\boldsymbol p_\perp,p_n) \qquad \text{where}\qquad p_n=\frac{n\pi}{a}\tag{4}
$$
With this, the energy becomes
$$
E_0=\frac12 L^{d-1}a\frac{\color{blue}{I(d)}}{(2\pi)^d} \tag{5}
$$
where we have defined
$$
\color{blue}{I(d)}\equiv \frac{2\pi}{a}\int\mathrm d^{d-1}\boldsymbol p_\perp\sum_{n=1}^\infty\sqrt{\left(\frac{n\pi}{a}\right)^2+\boldsymbol p_\perp^2}\tag{6}
$$
The integral above has rotational symmetry, so we change into hyper-spherical coordinates:
$$
I(d)=\frac{4}{a}\frac{\pi^{\frac{d+1}{2}}}{\Gamma\left(\frac{d-1}{2}\right)}\int_0^\infty\mathrm dp_\perp\ p^{d-2}_\perp\sum_{n=1}^\infty\sqrt{\left(\frac{n\pi}{a}\right)^2+p_\perp^2}\tag{7}
$$
where
$$
\frac{2\pi^{\frac{d-1}{2}}}{\Gamma\left(\frac{d-1}{2}\right)}\tag{8}
$$
is the area of the $(d-2)$-sphere.
We next introduce a zeta-like regulator,
$$
\sqrt{\left(\frac{n\pi}{a}\right)^2+p_\perp^2}\to \left[\left(\frac{n\pi}{a}\right)^2+p_\perp^2\right]^{x/2}\tag{9}
$$
where the physical limit is $x\to 1$. (Remark: we should also introduce a mass scale to keep the units consistent; as usual, the mass scale only affects the counter-terms, so we will omit it for simplicity).
After introducing the regulator, we can explicitly evaluate the integral (which is just the beta function):
$$
\int_0^\infty \mathrm dp_\perp\ p_\perp^{d-2}\left[\left(\frac{n\pi}{a}\right)^2+ p^2_\perp\right]^{x/2}=\left(\frac{n\pi}{a}\right)^{d-1+x}\frac{\Gamma\left(\frac{d-1}{2}\right)\Gamma\left(\frac{1-d-x}{2}\right)}{2\Gamma\left(-\frac x2\right)}\tag{10}
$$
The integral above does only converge if $\text{re}(d+x)<1$, so we must regard this result as the continuation to complex $x$. Plugging this into $I$, we get
\begin{equation}
I(d)=\frac{2\pi^\frac{d+1}{2}}{a}\frac{\Gamma\left(\frac{1-d-x}{2}\right)}{\Gamma\left(-\frac x2\right)}\color{green}{\sum_{n=1}^\infty \left(\frac{n\pi}{a}\right)^{d-1+x}}\tag{11}
\end{equation}
where again, the sum is a zeta function:
\begin{equation}
\color{green}{\sum_{n=1}^\infty \left(\frac{n\pi}{a}\right)^{d-1+x}}=\pi^{d-1+x}a^{1-d-x}\zeta(1-d-x)\tag{12}
\end{equation}
and therefore
\begin{equation}
I(d)=-\frac{\pi^\frac{3d}{2}}{a}\Gamma\left(-\frac{d}{2}\right)a^{-d}\zeta(-d)\tag{13}
\end{equation}
where we have taken $x\to 1$.
After all this, the vacuum energy is
\begin{equation}
E_0=-2^{-(d+1)}\pi^{d/2} L^{d-1}\Gamma\left(-\frac{d}{2}\right)a^{-d}\zeta(-d)\tag{14}
\end{equation}
and so the force per unit area is
\begin{equation}
\frac{F}{L^{d-1}}=-2^{-(d+1)}\pi^{d/2} d\Gamma\left(-\frac{d}{2}\right)a^{-(d+1)}\color{red}{\zeta(-d)}\tag{15}
\end{equation}
which, after a trivial simplification, coincides with the expression for the Casimir force we claimed at the beginning of this post.
We leave it to the reader to try to generalise this into other geometries; for example, we could consider a discretisation of modes along several spatial dimensions, in which case one would presumably encounter a sum of the form
$$
\sum_{\boldsymbol n} (\boldsymbol n^2)^{x/2}\tag{16}
$$
whose properties in the complex-$x$ plane seem to be rather non-trivial. The interested reader can find some useful information in this Math.SE post. We will not pursue this any further.
There exist a large body of literature on the topic of divergent asymptotic series. This article gives an overview of the theory from a practical point of view. This article focuses on methods that can be applied to asymptotic series of which all the terms are known, or at least the coefficients of the late terms are known to some leading approximation. In QFT one typically has just a few terms of a perturbation expansion, but even in that case one can apply certain mathematical methods to resum the series.
So, a typical problem is then that given a perturbative expansion of some function $f(g)$ in powers of some coupling $g$ we want to know the behavior of $f(g)$ for large $g$, but for large $g$ the series starts to diverge really fast, and we only have a few terms. This is not as hopeless as it looks, let's consider the following example. The logarithm of the factorial function $\log(n!)$ has the following asymptotic expansion for large $n$:
$$\log(n!) = n\log(n) - n + \frac{1}{2}\log(2\pi n) + \frac{1}{12 n} - \frac{1}{360 n^3} + \mathcal{O(n^{-3})}$$
Suppose that these are all the terms that we know about. We want to extract the behavior of the factorial function near $n = 0$ using only the given terms. We obviously cannot set $n = 0$, the individual terms will already diverge. But what we can do is to extrapolate to $n = 0$ using only larger values for $n$ where the series does make sense. E.g. there is no problem with inserting even not so large values like $n = 1$ in the series, moreover you can put $n = 1+\epsilon$, expand in powers of $\epsilon$ and then set $\epsilon = -1$ to jump to $n = 0$. The validity of such methods then depend on assuming that the function you are dealing with is analytic in a neighborhood of $n = 0$, and the fact that you have to use a divergent series around infinity to get there doesn't make the approximations invalid.
A more sophisticated way to get to the behavior near $n = 0$ is to apply a conformal transform to the expansion parameter. If we put:
$$n = \frac{1-z}{p z}$$
and expand in powers of $z$, we get:
$$
\begin{split}
f(z) =& \frac{1}{2}\log\left(\frac{2\pi}{p}\right)+\frac{\log(p)}{p} +\frac{1-\log(p)}{p z}+\left(\frac{1}{p}-\frac{1}{2}\right)\log(z) -\frac{\log(z)}{p z} +\\&\left(\frac{p}{12}+\frac{1}{2 p}-\frac{1}{2}\right)z + \left(\frac{p}{12}+\frac{1}{6 p}-\frac{1}{4}\right)z^2 + \left(-\frac{p^3}{360}+\frac{p}{12}+\frac{1}{12 p}-\frac{1}{6}\right) z^3+\mathcal{O}(z^4)
\end{split}
$$
There is now no problem with inserting $z=1$ in the series which corresponds to $n = 0$. The result will depend on the auxiliary parameter $p$, the optimal choice of this parameter is to choose it in such a way that the series has the best convergence. A good choice is obtained by setting the coefficient of the last term equal to zero, and then evaluating the coefficient of the coefficient of the previous term to see which solution makes that coefficient the smallest. This yields good results because typically the error is of the order of the last omitted term. Another interpretation is that the answer doesn't depend on $p$, but now you do have a $p$ dependence due to only having a finite number of terms. If you then make the last terms as small as possible, you're going to get a result that for other values of $p$ would have had to come from higher order terms. It's also worthwhile to consider the earlier terms to see if a particular value yields a more robust series.
In this case you find that $p = 0.863778859012849315368798819337$ looks to be the best value to use, it would be the the second choice when considering the value of the coefficient of $z^2$, but the other solution that makes this coefficient the smallest yields a series with a coefficient that is slightly increasing before becoming zero for the $z^3$ term.
So, if we then put $p = 0.863778859012849315368798819337$, we can put $z = 1$ to estimate $0!$, but we can do more. We can expand our function of $z$ around $z = 1$, by putting $z = 1-u$ and expanding in powers of $u$. If we then also expand $n$ in powers of $u$, we can invert the series to express $u$ in powers of $n$, so we then get to a series of $\log(n!)$ in powers of $n$. The result is:
$$
\log(n!)= 0.0002197 - 0.577755 n + 0.823213 n^2 - 0.401347 n^3 +\cdots
$$
The exact expansion with the coefficients to 8 significant figures is:
$$\log(n!) = -0.57721566 n + 0.82246703 n^2-0.40068563 n^3+\cdots$$
Clearly, a great deal of information can be extracted from divergent series, even if you just have a few terms. Mathematical rigor should be used to your advantage to extract as much information as possible, you should not be intimidated by mathematical rigor pointing to roadblocks.
Best Answer
You are conflating three conceptually different categories of "regularizations" of seemingly divergent series (and integrals).
The type of resummations that Hardy would talk about are similar to the zeta-function regularization - the example that is most familiar to the physicists. For example, $$S=\sum_{n=1}^\infty n= -\frac{1}{12}$$ is the most famous sum. Note that this result is unique; it is a well-defined number. In particular, that allows one to calculate the critical dimension of bosonic string theory from $(D-2)S/2+1=0$ and the result is $D=26$. Fundamentally speaking, there is no real divergence in the sum. The "divergent pieces" may be subtracted "completely".
However, in the usual cases of renormalization - of a loop diagram - in quantum field theory, there are divergences. Renormalization removes the "infinite part" of these terms. A finite term is left but the magnitude of the term is not uniquely determined, like it was in the case of the sum of positive integers. Instead, every type of a divergence in a loop diagram produces one parameter - analogous to the coupling constant - that has to be adjusted. Because the finite results can be "anything", this is clearly something else than the zeta-regularization and, more generally, Hardy's procedures whose very goal was to produce unique, well-defined results for seemingly divergent expressions. Infinitesimally speaking, the Renormalization Group only mixes the lower-order contributions (by the number of loops) into a higher-order contribution.
So these are two different things that one should distinguish.
There is another category of problems that is different from both categories above: the summation of the perturbative expansions to all orders. It can be demonstrated that in almost all fields theories - and perturbative string theories as well - the perturbative expansions diverge. For a small coupling, one can sum them up to the smallest term, before the factorial-like coefficient begins to increase the terms again, despite the $g^{2L}$ suppression. The smallest term is of the same order as the leading non-perturbative contributions.
At the very end, if the theory can be non-perturbatively well-defined - and both QCD-like theories and string theory can, at least in principle - the full function as a function of the coupling constant $g$ exists. But it just can't be fully obtained from the perturbative expansion. The Renormalization Group won't really help you because it only mixes the perturbative terms of another order to a perturbative diagram you want to calculate. If you don't know the non-perturbative physics, the equations of the Renormalization Group won't fill the gap because they will keep you in the perturbative realm.
So I have sketched three different things: in the Hardy/zeta problems, the answer to the divergent series was unique; in the particular $L$-loop diagrams in QFT, it wasn't unique but the infinite part was subtracted and the finite part was obtained by a comparison with the experiments; and in the perturbative expansion resummed to all orders, the sum actually didn't converge and indeed, it didn't know about all the information about the full result for a finite $g$.
The last statement may have some subtleties; at least for some theories, the non-perturbative physics is fully determined by the perturbative physics. But I think it is not quite general and we have counterexamples - e.g. for AdS/CFT with orthogonal groups and different discrete values of $B$ etc. So it means that the perturbative expansion doesn't uniquely determine the theory non-perturbatively.
Because the three examples differ at the level of "what can be calculated" and "what cannot", they are different.