The answer to 1) is that this is a special case of a broader phenomenon for symplectic resolutions (though I think some features are specific to the Grothendieck-Springer case). For instance you have such a deformation for quiver varieties. I'm not sure what general results have been proven, but I think you can find some statements in papers of Namikawa, e.g. https://arxiv.org/abs/0902.2832. In any case, there are people on this website who are much better suited to answering this question - hopefully one of them can elaborate.
For 2), the Grothendieck-Springer resolution also naturally arises in the context of Beilinson-Bernstein. Namely, note that the usual versions of Beilinson-Bernstein involve fixing an integral Harish-Chandra central character and (at an imprecise level) relate the category of $U\mathfrak{g}$-modules at that central character to a category of $\mathcal{D}$-modules at the corresponding dominant twist. These two sides roughly come from quantizing the nilpotent cone and the cotangent bundle of the flag variety (this is the "semiclassical shadow" you mention).
On the other hand, you can consider all possible twists at once. On the $\mathcal{D}$-module side, you have an $\mathfrak{h}^*$'s worth of twists, but on the universal enveloping algebra side, you only have an $\mathfrak{h}^*/W$'s worth of twists. That's because this situation corresponds exactly to quantizing the Grothendieck-Springer resolution, with its natural Poisson structure (instead of a symplectic structure). More explicitly, you have a sheaf $\tilde{\mathcal{D}}$ on $G/B$, with center $\operatorname{Sym}(\mathfrak{h})$, such that if you take the quotient with central character $\lambda$, you get the sheaf of $\lambda$-twisted differential operators. Taking the associated graded turns $\tilde{\mathcal{D}}$ into the symmetric algebra associated to $\tilde{\mathfrak{g}}$, viewed as a vector bundle on $G/B$. On the other side, taking the associated graded of $U\mathfrak{g}$ recovers functions on $\mathfrak{g}$. Therefore, if you consider the big (i.e. simultaneously w.r.t. all twists) localization functor $M\mapsto M\otimes_{U\mathfrak{g}}\tilde{\mathcal{D}}$, this quantizes pullback along the Grothendieck-Springer resolution. Because of the $|W|$-to-$1$ nature of the map, the geometry here is well-suited for the study of intertwining functors (which compare localization at weights in the same Weyl group orbit), and is used e.g. in Beilinson-Ginzburg https://arxiv.org/abs/alg-geom/9709022.
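To spell out the geometry behind this, here is a sketch in standard notation. The names $\pi$, $\nu$, $\chi$ are my own labels, and I am assuming we identify $\mathfrak{g}\cong\mathfrak{g}^*$ and $\mathfrak{h}\cong\mathfrak{h}^*$ via an invariant form. The space $\tilde{\mathfrak{g}}=\{(x,\mathfrak{b}) : x\in\mathfrak{b}\}$ sits in a commutative square

$$\begin{array}{ccc}
\tilde{\mathfrak{g}} & \xrightarrow{\ \nu\ } & \mathfrak{h}^* \\
{\scriptstyle\pi}\downarrow & & \downarrow \\
\mathfrak{g} & \xrightarrow{\ \chi\ } & \mathfrak{h}^*/W
\end{array}$$

where $\pi(x,\mathfrak{b})=x$ is the Grothendieck-Springer map, $\nu(x,\mathfrak{b})$ is the class of $x$ in $\mathfrak{b}/[\mathfrak{b},\mathfrak{b}]\cong\mathfrak{h}$, and $\chi$ is the Chevalley (adjoint quotient) map. On functions, the top row is $\operatorname{Sym}(\mathfrak{h})\to\mathcal{O}(\tilde{\mathfrak{g}})$, the classical counterpart of the center of $\tilde{\mathcal{D}}$, while the bottom row is $\operatorname{Sym}(\mathfrak{h})^W\cong\operatorname{Sym}(\mathfrak{g})^G\to\operatorname{Sym}(\mathfrak{g})$, the classical counterpart of the Harish-Chandra center of $U\mathfrak{g}$. The Springer resolution is the fiber over $0$:

$$\tilde{\mathcal{N}}=\nu^{-1}(0)=T^*(G/B)\ \longrightarrow\ \mathcal{N}=\chi^{-1}(0).$$

For $\mathfrak{g}=\mathfrak{sl}_2$ everything can be written by hand: $\tilde{\mathfrak{g}}=\{(x,\ell)\in\mathfrak{sl}_2\times\mathbb{P}^1 : x\ell\subseteq\ell\}$, $\nu(x,\ell)=\lambda$ is the eigenvalue of $x$ on $\ell$, and (identifying $\mathfrak{h}^*/W$ with the affine line via $\lambda\mapsto\lambda^2$)

$$\chi(x)=-\det x=\lambda^2.$$

If $\det x\neq 0$ there are exactly two eigenlines, with eigenvalues $\pm\lambda$, so $\pi$ is $2$-to-$1$ there; a nonzero nilpotent $x$ has a single invariant line, and the fiber over $x=0$ is all of $\mathbb{P}^1$.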
BTW, you can explicitly see this "semiclassical limit" in the characteristic $p$ setting, where quantum is much closer to classical. For this see the sequence of papers starting with Bezrukavnikov-Mirkovic-Rumynin, https://arxiv.org/abs/math/0205144. Again, there the difference between Grothendieck-Springer and Springer comes from what versions of central character conditions you want to impose. When you want to study all central characters at once (or even if you just care about the formal neighborhood of one central character, i.e. requiring the Harish-Chandra center to act via a generalized central character instead of strict equality), you need Grothendieck-Springer.
Let me mention one last setting where the difference between Springer and Grothendieck-Springer appears: when you want to relate equivariant coherent (derived) categories of Springer-like gadgets and categories of perverse sheaves on affine flag varieties/Grassmannians (you can think of these theorems as cases of geometric Langlands on $\mathbb{P}^1$ with points of tame ramification, composed with the long intertwining functor). The decategorified version is a theorem of Kazhdan-Lusztig computing the equivariant K-theory of the Steinberg variety to be the affine Hecke algebra (see chapter 7 of Chriss-Ginzburg).
There are many similar theorems in papers of Bezrukavnikov and others - let me take a specific one, which appears in https://arxiv.org/abs/1209.0403v4. Recall that Beilinson & Bernstein relate category O to perverse sheaves on the flag variety constant along the Schubert stratification. There are various versions of this latter category, e.g. I can take perverse sheaves on $G/B$ equivariant with respect to either $N$ or $B$. Now let me move to the affine setting, so that $G$ and $B$ get replaced by the loop group $G(K)$ and the Iwahori $I$. I can take either $I$- or $I_0$-equivariant perverse sheaves on $G(K)/I$, the affine flag variety (here $I_0$ is the unipotent radical of $I$), and these two categories (or rather their derived versions) I will denote by $D_{II}$ and $D_{I_0I}$. Now Bezrukavnikov's theorem matches $D_{II}$ and $D_{I_0I}$ up with the derived categories of $G_L$-equivariant sheaves on $\tilde{\mathcal{N}}_L \times_{\mathfrak{g}_L}\tilde{\mathcal{N}}_L$ and $\tilde{\mathcal{N}}_L \times_{\mathfrak{g}_L}\tilde{\mathfrak{g}}_L$, respectively. So here the choice between Springer and Grothendieck-Springer depends on whether you are considering $I$- or $I_0$-equivariant sheaves.
Best Answer
One reason to emphasize the Springer resolution's role as a moment map is that it is the semiclassical shadow of Beilinson-Bernstein localization. More precisely, passing to functions, the moment map description asserts that the Springer map describes the Hamiltonian functions on the cotangent bundle of the flag variety which generate the action of the Lie algebra. We may now quantize the cotangent bundle $T^* G/B$ to the ring of differential operators on $G/B$, and likewise quantize the dual $\mathfrak{g}^*$ of the Lie algebra to the universal enveloping algebra $U\mathfrak{g}$, so that the moment map quantizes to the map from $U\mathfrak{g}$ to global differential operators on the flag variety. What's truly significant about the Springer map (it's a birational, proper, symplectic [crepant] resolution of [rational] singularities) now translates into the Beilinson-Bernstein equivalence (for generic parameters) between $U\mathfrak{g}$-modules and (twisted) D-modules on the flag variety, the cornerstone of geometric representation theory. There's now an entire subject (wonderfully represented in a workshop last week in Luminy) seeking to generalize all the features of this setup to other symplectic resolutions and their quantizations, viewed as the settings for "new representation theories" (the prime examples being Hilbert schemes and other quiver varieties).
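In formulas, as a sketch: the notation $v_a$, $\mu$ is my own, and signs depend on conventions. If $G$ acts on a smooth variety $X$, each $a\in\mathfrak{g}$ gives a vector field $v_a$ on $X$, and the moment map $\mu: T^*X\to\mathfrak{g}^*$ is

$$\mu(x,\xi)(a)=\langle \xi, v_a(x)\rangle,\qquad x\in X,\ \xi\in T^*_xX,\ a\in\mathfrak{g}.$$

For $X=G/B$ the cotangent space at the point $\mathfrak{b}$ is $(\mathfrak{g}/\mathfrak{b})^*=\mathfrak{b}^\perp\subset\mathfrak{g}^*$, and $\mu$ is simply $(\mathfrak{b},\xi)\mapsto\xi$; under $\mathfrak{g}\cong\mathfrak{g}^*$ its image is the nilpotent cone, which is how the Springer map appears. The quantized statement is that $a\mapsto v_a$ extends to an algebra map, whose associated graded is pullback along $\mu$:

$$U\mathfrak{g}\longrightarrow\Gamma(G/B,\mathcal{D}),\qquad \operatorname{gr}:\ \operatorname{Sym}(\mathfrak{g})=\mathcal{O}(\mathfrak{g}^*)\xrightarrow{\ \mu^*\ }\mathcal{O}(T^*G/B).$$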