First: Whether the metastable region is acceptable is somewhat debatable. I think most experts would say no. Even if the tunneling is very slow, one would have to explain why the Universe started in a configuration whose energy is very far from the minimum, in a metastable valley.
There are also papers that place the observed mass in the strictly unstable regime.
But there's another question here:
How can we start with a consistent model at lower scales and arrive at an inconsistent model at higher scales?
This question contains an incorrect assumption. In reality, we – more precisely, Nature – don't start at low scales. Nature always starts at the high scales. That's where the fundamental laws of Nature have their well-defined form, and from there we may derive the effective laws valid at low energies or long distances. We end up with low scales; we don't start with them!
If you're thinking that you're "starting" with a low-scale theory, you're really solving an inverse problem: you're looking for a theory valid up to higher scales or all scales – a more universally usable, more fundamental theory – which happens to reduce to a given low-energy effective theory. The fact that you may derive an inconsistent (or no) high-energy theory simply means that your inverse problem has no solution. The low-energy effective theory may look consistent but it's just an artifact of the approximations. When one looks properly, it is inconsistent. The extrapolation to higher energies is a systematic way to see this fact.
I accidentally wrote about the stability argument yesterday here:
http://motls.blogspot.com/2012/07/why-125-gev-higgs-boson-isnt-quite.html?m=1
The main problem lies in the "large logarithms". Indeed, suppose you want to calculate some quantity in quantum field theory, for instance a Green function. In perturbation theory this is something like:
$$\tilde{G}(p_1,...,p_n)=\sum_k g^k F_k(p_1,...,p_n)$$
for some generic functions $F_k$, where $g$ is the coupling constant. It's not enough to require a small $g$: you need small $g$ AND small $F_k$, for every value of the momenta $p_i$ (i.e., for every value of the energy scale of your system).
Here is a nice little calculation to understand this point. It's obvious that:
$$\int_0^\infty \frac{dx}{x+a}=\Big[\log(x+a)\Big]_0^\infty=\infty$$
Let's use a cutoff:
$$\int_0^\Lambda \frac{dx}{x+a}=\log\frac{\Lambda+a}{a}$$
This is still infinite if the (unphysical) cutoff is removed. The whole point of renormalization is to show that a finite limit exists (this is "Fourier-dual" to sending the discretization interval of the theory to zero). This quantity, instead, stays finite as $\Lambda \rightarrow \infty$:
$$\int_0^\Lambda \frac{dx}{x+a}-\int_0^\Lambda \frac{dx}{x+b} \rightarrow \log\frac{b}{a}$$
But if $a \rightarrow \infty$ the infinity strikes back!
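A quick numerical check of this limit (plain Python, with illustrative values $a=1$, $b=4$): each regulated integral grows without bound as the cutoff increases, while their difference settles at $\log(b/a)$.

```python
import math

a, b = 1.0, 4.0  # arbitrary illustrative subtraction parameters

def I(cutoff, m):
    """Closed form of the regulated integral: ∫_0^Λ dx/(x+m) = log((Λ+m)/m)."""
    return math.log((cutoff + m) / m)

for Lam in (1e2, 1e4, 1e6, 1e8):
    diff = I(Lam, a) - I(Lam, b)
    # I(Λ, a) keeps growing with Λ; the difference converges
    print(f"Λ = {Lam:>8.0e}   I_a = {I(Lam, a):8.4f}   I_a - I_b = {diff:.6f}")

print("log(b/a) =", math.log(b / a))  # the finite limit, ≈ 1.386294
```

The divergent pieces cancel in the difference; only the ratio of the two scales survives, which is exactly the structure renormalization exploits.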
So for a generic quantity $F(p)$ regularized to $F(p)-F(0)$ we want at least two things: that the coupling is small at that momentum $p$, and that $p$ is not far away from zero. But zero is arbitrary; we can choose an arbitrary (subtraction) scale. So we can vary this arbitrary scale $\mu$ in such a way that it is always near the energy scale we are probing.
It is convenient to take this scale $\mu$ equal to the renormalization scale. This is the energy at which you impose some finiteness conditions (usually two conditions on the two-point Green function and one condition on the four-point one). The finiteness conditions are real physical measurements at an arbitrary energy scale, so they fix the universe in which you live. If you change $\mu$ without changing the mass, charge, etc., you are changing universe. The meaning of the renormalization group equations is to span the different subtraction points of the theory while remaining in your universe. And of course every physical quantity is independent of this arbitrary scale.
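As a sketch of this $\mu$-independence, here is a small Python illustration using the standard one-loop QED-like running (the same effective-charge formula Schwartz derives, quoted further below); all numbers are made up for illustration. Predicting the effective coupling at some high $p^2$ from two different subtraction points gives the same answer, provided the measured coupling is run between them:

```python
import math

b = 1.0 / (12 * math.pi**2)  # one-loop QED-like beta coefficient

def run(e2, mu_from, mu_to):
    """One-loop running of the coupling: 1/e²(μ') = 1/e²(μ) − b·ln(μ'²/μ²)."""
    return 1.0 / (1.0 / e2 - b * math.log(mu_to**2 / mu_from**2))

def e2_eff(p2, e2_R, mu):
    """Resummed effective coupling at momentum transfer p², subtraction point μ."""
    return e2_R / (1.0 - e2_R * b * math.log(p2 / mu**2))

e2_at_1 = 0.09   # coupling "measured" at μ = 1 (arbitrary units, illustrative)
p2 = 1.0e6       # high momentum transfer we want to probe

# Same physical prediction from two different subtraction points:
pred1 = e2_eff(p2, e2_at_1, 1.0)
e2_at_100 = run(e2_at_1, 1.0, 100.0)        # stay in the same universe: run e_R too
pred2 = e2_eff(p2, e2_at_100, 100.0)

print(pred1, pred2)  # identical: the physics does not depend on μ
```

Changing $\mu$ while compensating with the running of $e_R$ leaves $e_{eff}(p^2)$ invariant; changing $\mu$ alone would not.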
EDIT:
Some extra motivation for the running couplings and renormalization group equations, directly from Schwartz:
The continuum RG is an extremely practical tool for getting partial results for high-order loops from low-order loops. [...]
Recall [...] that the difference between the momentum-space Coulomb potential $V(t)$ at two scales, $t_1$ and $t_2$, was proportional to [...] $\ln t_1$ for $t_1 \ll t_2$. The RG is able to reproduce this logarithm, and similar logarithms of physical quantities. Moreover, the solution to the RG equation is equivalent to summing series of logarithms to all orders in perturbation theory. With these all-orders results, qualitatively important aspects of field theory can be understood quantitatively. Two of the most important examples are the asymptotic behavior of gauge theories and critical exponents near second-order phase transitions.
[...]
$$e^2_{eff}(p^2)=\frac{e^2_R}{1-\frac{e^2_R}{12 \pi^2}\ln\frac{p^2}{\mu^2}}$$
$$e_R=e_{eff}(\mu)$$
This is the effective coupling including the 1-loop 1PI graphs; this is called leading-logarithmic resummation.
Once all of these 1PI 1-loop contributions are included, the next terms we are missing should be subleading in some expansion. [...] However, it is not obvious at this point that there cannot be a contribution of the form $\ln^2\frac{p^2}{\mu^2}$ from a 2-loop 1PI graph. To check, we would need to perform the full next-order calculation, including graphs with loops and counterterms. As you might imagine, trying to resum large logarithms beyond the leading-logarithmic level diagrammatically is extremely impractical. The RG provides a shortcut to systematic resummation beyond the leading-logarithmic level.
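To see concretely what "summing the leading logarithms" means, here is a small Python sketch (illustrative numbers only): the resummed one-loop formula above is a geometric series in $\frac{e_R^2}{12\pi^2}\ln\frac{p^2}{\mu^2}$, and the order-by-order partial sums converge to it:

```python
import math

b = 1.0 / (12 * math.pi**2)
e2_R = 0.3              # illustrative coupling at the subtraction point
L = math.log(1.0e4)     # ln(p²/μ²): a "large logarithm"

# The resummed (all-orders leading-log) effective coupling:
resummed = e2_R / (1.0 - e2_R * b * L)

# Order by order: the k-loop leading-log term is e_R² · (e_R² b L)^k
partial = 0.0
for k in range(6):
    partial += e2_R * (e2_R * b * L) ** k
    print(f"through {k} loops: {partial:.10f}")

print(f"resummed:        {resummed:.10f}")
```

Each extra loop contributes one more power of the logarithm; the RG sums the whole geometric tower in one step instead of diagram by diagram.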
Another example: in supersymmetry you usually have nice (theoretically predicted) renormalization conditions for your couplings at very high energy (because you expect some ordering principle from the underlying fundamental theory, string theory for instance). To get predictions for the couplings you must RG-evolve them all down to the electroweak scale, or to whatever scales humans perform experiments at. Using the RG equations ensures that the loop expansions for calculations of observables will not suffer from very large logarithms.
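A minimal sketch of this procedure, assuming a toy one-loop RG equation $dg/d\ln\mu = b\,g^3/(16\pi^2)$ with made-up boundary values (the coefficient and scales below are illustrative, not real MSSM numbers): impose the coupling at a high "GUT-like" scale and numerically evolve it down to the electroweak scale, checking against the analytic one-loop solution.

```python
import math

def run_coupling(g_high, mu_high, mu_low, b, steps=10000):
    """Integrate the toy one-loop RG equation dg/dlnμ = b·g³/(16π²)
    downward from μ_high to μ_low with simple Euler stepping."""
    g = g_high
    t, t_end = math.log(mu_high), math.log(mu_low)
    dt = (t_end - t) / steps
    for _ in range(steps):
        g += dt * b * g**3 / (16 * math.pi**2)
    return g

# Toy GUT-like boundary condition at a high scale (illustrative numbers only):
g_gut, M_gut, M_ew = 0.7, 2.0e16, 1.0e2
b1 = 6.6  # assumed one-loop coefficient, chosen for illustration

g_low = run_coupling(g_gut, M_gut, M_ew, b1)

# Analytic one-loop check: 1/g² is linear in ln μ
g_exact = (1.0 / g_gut**2 - (b1 / (8 * math.pi**2)) * math.log(M_ew / M_gut)) ** -0.5
print(g_low, g_exact)
```

With a positive beta coefficient the coupling shrinks toward low scales; a single analytic run replaces the large logarithm $\ln(M_{GUT}/M_{EW}) \approx 33$ that would otherwise spoil fixed-order perturbation theory.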
A suggested reference: Schwartz, Quantum Field Theory and the Standard Model; see for instance pp. 313 and 422.
Best Answer
The running coupling $\lambda(\mu)$, as a function of renormalization scale $\mu$, does run negative for large $\mu$ in the SM if the Higgs is not too heavy. But "renormalization scales" are not particularly physical things to talk about. A more physical quantity is the renormalization-group improved effective Higgs potential, $V(H)$. For large values of $H$, this is roughly just $\lambda(\left|H\right|) \left|H\right|^4$ (i.e., just evaluate the quartic at an RG scale equal to $H$). In other words, you can approximately equate the statement "$\lambda$ runs negative at large RG scales" with the statement "the Higgs potential turns over at large Higgs VEVs." The latter statement is clearly more connected to the existence of some kind of instanton giving rise to vacuum decay.
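A toy numerical sketch of that equivalence (the logarithmic running and every number below are invented for illustration, not the real SM two-loop running): once $\lambda(\mu)$ crosses zero at some scale $\mu_*$, the RG-improved potential $\lambda(\left|H\right|)\left|H\right|^4$ turns over for field values above $\mu_*$.

```python
import math

# Assumed toy running: the quartic decreases logarithmically with scale.
lam_ew, c, v = 0.13, 0.01, 246.0   # illustrative values only

def lam(mu):
    """Toy running quartic λ(μ) = λ_EW − c·ln(μ/v)."""
    return lam_ew - c * math.log(mu / v)

def V(H):
    """RG-improved potential at large field values: roughly λ(|H|)·|H|⁴."""
    return lam(abs(H)) * H**4

# Scale where the quartic crosses zero:
mu_star = v * math.exp(lam_ew / c)
print("λ turns negative near μ* ≈ %.3e" % mu_star)

# The potential is positive below μ* and turns over above it:
print(V(mu_star / 10) > 0, V(mu_star * 10) < 0)
```

The statement "$\lambda$ runs negative at large scales" and "the potential turns over at large VEVs" are visibly the same fact in this parametrization.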
There's a long literature on this back to the 80s, if not earlier. You might start with hep-ph/0104016 by Isidori, Ridolfi, and Strumia and work your way in either direction through the literature....