The answer to 1) is that this is a special case of a broader phenomenon for symplectic resolutions (though I think some features are specific to the Grothendieck-Springer case). For instance you have such a deformation for quiver varieties. I'm not sure what general results have been proven, but I think you can find some statements in papers of Namikawa, e.g. https://arxiv.org/abs/0902.2832. In any case, there are people on this website who are much better suited to answer this question - hopefully one of them can elaborate.
For 2), the Grothendieck-Springer resolution also naturally arises in the context of Beilinson-Bernstein. Namely, note that the usual versions of Beilinson-Bernstein involve fixing an integral Harish-Chandra central character and (at an imprecise level) relate the category of $U\mathfrak{g}$-modules at that central character to a category of $\mathcal{D}$-modules at the corresponding dominant twist. These two sides roughly come from quantizing the nilpotent cone and the cotangent bundle of the flag variety (this is the "semiclassical shadow" you mention).
On the other hand, you can consider all possible twists at once. On the $\mathcal{D}$-module side, you have an $\mathfrak{h}^*$'s worth of twists, but on the universal enveloping algebra side, you only have an $\mathfrak{h}^*/W$'s worth of twists. That's because this situation corresponds exactly to quantizing the Grothendieck-Springer resolution, with its natural Poisson structure (instead of a symplectic structure). More explicitly, you have a sheaf $\tilde{\mathcal{D}}$ on $G/B$, with center $\operatorname{Sym}(\mathfrak{h})$, such that if you take the quotient with central character $\lambda$, you get the sheaf of $\lambda$-twisted differential operators. Taking associated graded transforms $\tilde{\mathcal{D}}$ into the symmetric algebra associated to $\tilde{\mathfrak{g}}$, viewed as a vector bundle on $G/B$. On the other side, taking associated graded of $U\mathfrak{g}$ recovers functions on $\mathfrak{g}$. Therefore, if you consider the big (i.e. simultaneously w.r.t. all twists) localization functor $M\mapsto M\otimes_{U\mathfrak{g}}\tilde{\mathcal{D}}$, this quantizes pullback along the Grothendieck-Springer resolution. Because the map is generically $|W|$-to-$1$, the geometry here is well-suited for the study of intertwining functors (which compare localization at weights in the same Weyl group orbit), and is used e.g. in Beilinson-Ginzburg https://arxiv.org/abs/alg-geom/9709022.
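To see where the $\mathfrak{h}^*$ versus $\mathfrak{h}^*/W$ discrepancy comes from, it may help to write down the classical picture. The following is only a sketch, assuming we identify $\mathfrak{g}\cong\mathfrak{g}^*$ and $\mathfrak{h}\cong\mathfrak{h}^*$ via an invariant form; the name $\pi$ for the Grothendieck-Springer map is notation introduced here.

$$
\begin{array}{ccc}
\tilde{\mathfrak{g}} & \xrightarrow{\ \pi\ } & \mathfrak{g} \\
\downarrow & & \downarrow \\
\mathfrak{h}^* & \longrightarrow & \mathfrak{h}^*/W
\end{array}
$$

Here $\tilde{\mathfrak{g}}=\{(x,\mathfrak{b}') : x\in\mathfrak{b}'\}$, the left vertical map sends $(x,\mathfrak{b}')$ to the class of $x$ in $\mathfrak{b}'/[\mathfrak{b}',\mathfrak{b}']\cong\mathfrak{h}$, and the right vertical map is the adjoint quotient. On functions, the bottom row is the inclusion $\operatorname{Sym}(\mathfrak{h})^W\subset\operatorname{Sym}(\mathfrak{h})$. This is the classical counterpart of the fact that the center of $U\mathfrak{g}$ is $\operatorname{Sym}(\mathfrak{h})^W$ (Harish-Chandra, up to the usual $\rho$-shift), while the center of $\tilde{\mathcal{D}}$ is all of $\operatorname{Sym}(\mathfrak{h})$.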
BTW, you can explicitly see this "semiclassical limit" in the characteristic $p$ setting, where quantum is much closer to classical. For this see the sequence of papers starting with Bezrukavnikov-Mirkovic-Rumynin, https://arxiv.org/abs/math/0205144. Again, there the difference between Grothendieck-Springer and Springer comes from what versions of central character conditions you want to impose. When you want to study all central characters at once (or even if you just care about the formal neighborhood of one central character, i.e. requiring the Harish-Chandra center to act via a generalized central character instead of strict equality), you need Grothendieck-Springer.
Let me mention one last setting where the difference between Springer and Grothendieck-Springer appears: when you want to relate equivariant coherent (derived) categories of Springer-like gadgets and categories of perverse sheaves on affine flag varieties/Grassmannians (you can think of these theorems as cases of geometric Langlands on $\mathbb{P}^1$ with points of tame ramification, composed with the long intertwining functor). The decategorified version is a theorem of Kazhdan-Lusztig computing the equivariant K-theory of the Steinberg variety to be the affine Hecke algebra (see Chapter 7 of Chriss-Ginzburg).
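For orientation, the Kazhdan-Lusztig statement can be written roughly as follows. This is a sketch with the group conventions (e.g. simple connectedness) as in Chriss-Ginzburg; the symbols $Z$ and $\mathcal{H}$ are notation introduced here, not taken from the discussion above.

$$
K^{G\times\mathbb{C}^\times}(Z)\;\cong\;\mathcal{H},\qquad Z=\tilde{\mathcal{N}}\times_{\mathcal{N}}\tilde{\mathcal{N}},
$$

where $Z$ is the Steinberg variety, $\mathcal{H}$ is the affine Hecke algebra, and the Hecke parameter $q$ comes from the equivariant K-theory of a point for the $\mathbb{C}^\times$ factor.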
There are many similar such theorems in papers of Bezrukavnikov and others - let me take a specific such theorem which appears in https://arxiv.org/abs/1209.0403v4. Recall that Beilinson & Bernstein relate category $\mathcal{O}$ to perverse sheaves on the flag variety constant along the Schubert stratification. There are various versions of this latter category, e.g. I can take perverse sheaves on $G/B$ equivariant with respect to either $N$ or $B$. Now let me move to the affine setting, so that $G$ and $B$ get replaced by the loop group $G(K)$ and the Iwahori $I$. I can take either $I$- or $I_0$-equivariant perverse sheaves (where $I_0$ is the unipotent radical of $I$) on $G(K)/I$, the affine flag variety, and I will denote these two categories (or rather their derived versions) by $D_{II}$ and $D_{I_0I}$. Now Bezrukavnikov's theorem matches $D_{II}$ and $D_{I_0I}$ up with the derived categories of $G_L$-equivariant coherent sheaves on $\tilde{\mathcal{N}}_L \times_{\mathfrak{g}_L}\tilde{\mathcal{N}}_L$ and $\tilde{\mathcal{N}}_L \times_{\mathfrak{g}_L}\tilde{\mathfrak{g}}_L$, respectively. So here the choice between Springer and Grothendieck-Springer depends on whether you are considering $I$- or $I_0$-equivariant sheaves.
Best Answer
I will try to answer the first question only.
As in the remarks, the canonical reference is
Beauville, Narasimhan, Ramanan, Spectral curves and the generalised theta divisor. J. Reine Angew. Math. 398 (1989), 169–179. https://doi.org/10.1515/crll.1989.398.169
First part of your question, about the discriminant locus:
Let $s=(s_1,\ldots,s_N)\in \mathcal A$. This defines a morphism $T_C \to T^N_C$, where $T^i_C$ is the line bundle associated to the dual of $\Omega^{\otimes i}_C$. The corresponding spectral curve $C_s \to C$ is the pullback of the zero section $0:C\to T^N_C$, that is
$$ C_s=C\times_{T^N_C} T_C \; .$$
Let $x\in C$. Then an easy application of the Jacobian criterion shows that $C_s$ is singular at $(x,0)$ if and only if $\operatorname{div}(s_N)\geq 2(x)$ and $\operatorname{div}(s_{N-1})\geq (x)$ [warning: BNR Remark 3.5 is wrong]. By Riemann-Roch, at least if $N$ is large enough, this defines (for fixed $x$) a locus of codimension $3$ in $\mathcal A$. So yes, the so-called discriminant locus is not reduced to $0$.
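The Jacobian criterion computation can be spelled out as follows. This is a sketch in local coordinates: $z$ (a coordinate on $C$ centred at $x$) and $y$ (the fibre coordinate) are my notation, and the sign convention for the coefficients is an assumption that does not affect the conclusion. Locally $C_s$ is cut out by

$$
P(z,y)=y^N+s_1(z)\,y^{N-1}+\cdots+s_{N-1}(z)\,y+s_N(z)=0,
$$

and at the point $(z,y)=(0,0)$ one finds

$$
P(0,0)=s_N(0),\qquad \frac{\partial P}{\partial y}(0,0)=s_{N-1}(0),\qquad \frac{\partial P}{\partial z}(0,0)=s_N'(0).
$$

So $(x,0)$ is a singular point of $C_s$ exactly when all three quantities vanish, i.e. $s_N$ vanishes to order at least $2$ at $x$ and $s_{N-1}$ vanishes at $x$. These are three linear conditions on the pair $(s_{N-1},s_N)$, which is where the codimension count comes from once they are independent.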
Second part of your question: the BNR correspondence (Proposition 3.6) establishes an equivalence between Higgs bundles with characteristic polynomial $s$ and torsion-free sheaves of rank $1$ on $C_s$ (in BNR, this is stated for $C_s$ integral, but Schaub has extended this to any spectral curve). So yes, the fiber of the Hitchin map is a compactification of the Jacobian of $C_s$. This is even one of the main theorems of
Schaub, Daniel, Courbes spectrales et compactifications de jacobiennes. Math. Z. 227 (1998), no. 2, 295–312. https://doi.org/10.1007/PL00004377
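Concretely, the correspondence goes through pushforward along the projection. As a rough sketch, at least when $C_s$ is integral (with $\pi\colon C_s\to C$ the projection, $y$ the tautological fibre coordinate, and $L$ a rank-one torsion-free sheaf on $C_s$, all notation introduced here), one sets

$$
E=\pi_*L,\qquad \phi\colon E\to E\otimes\Omega_C\ \text{ induced by multiplication by } y.
$$

Then $E$ is a vector bundle of rank $N$ on $C$, and since $y$ satisfies the equation of $C_s$, the characteristic polynomial of $\phi$ is the one determined by $s$. Conversely, a Higgs bundle $(E,\phi)$ with characteristic polynomial $s$ makes $E$ into a module over $\pi_*\mathcal{O}_{C_s}$, which recovers $L$.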