If the set of $x$ for which $g(x)=1$ is not a set of $\mu$ measure $0$, then $M_{g}$ has an eigenvalue of $1$. But an eigenvalue of $1$ is impossible for $(A+iI)(A-iI)^{-1}$ because $(A+iI)(A-iI)^{-1}x=x$ implies
$$
x+2i(A-iI)^{-1}x=x \implies (A-iI)^{-1}x=0 \implies x = 0.
$$
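As a finite-dimensional sanity check of this step, here is a minimal NumPy sketch. It assumes a random Hermitian matrix as a stand-in for $A$ (so $A$ is bounded here, which the general argument does not require); the names `A`, `R`, and `U` are chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Hypothetical stand-in for A: a random Hermitian matrix.
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2
I = np.eye(n)

R = np.linalg.inv(A - 1j * I)   # resolvent (A - iI)^{-1}
U = (A + 1j * I) @ R            # Cayley transform of A

# U is unitary, and U = I + 2i (A - iI)^{-1}.
is_unitary = np.allclose(U.conj().T @ U, I)
matches_identity = np.allclose(U, I + 2j * R)

# The eigenvalues of U are (t + i)/(t - i) for real eigenvalues t of A,
# so their distance to 1 is 2/sqrt(t^2 + 1) > 0.
dist_to_one = np.min(np.abs(np.linalg.eigvals(U) - 1))
```

In this toy setting `dist_to_one` stays strictly positive, matching the claim that $1$ is never an eigenvalue of the Cayley transform.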
If $fVx \in L^{2}$ for some $x$, then the following holds for some $y$:
$$
\begin{gathered}
i\frac{g+1}{g-1}Vx = Vy \in L^{2} \\
i(g+1)Vx = (g-1)Vy \\
i(U+I)x = (U-I)y
\end{gathered}
$$
Now, if you're careful, you can show that $x$ is in the range of $(A-iI)^{-1}$, which is the same as the domain of $A$. To prove this, substitute the following identity into the last equation above and solve for $x$ in the form $x=(A-iI)^{-1}z$:
$$
U = (A+iI)(A-iI)^{-1}= I+2i(A-iI)^{-1}.
$$
The steps are basically reversible back up to $fVx \in L^{2}$.
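For completeness, here is one way to fill in those steps. The shorthand $R=(A-iI)^{-1}$ is introduced here for convenience and is not part of the original argument. Since $A+iI=(A-iI)+2iI$, one has $U=I+2iR$, so $U+I=2I+2iR$ and $U-I=2iR$. Then
$$
\begin{aligned}
i(U+I)x = (U-I)y
&\iff i(2I+2iR)x = 2iRy \\
&\iff 2ix - 2Rx = 2iRy \\
&\iff x = R(y-ix).
\end{aligned}
$$
So $x$ lies in the range of $R$, i.e. in the domain of $A$, with $z=y-ix$. Applying $A-iI$ to both sides gives $Ax-ix=y-ix$, that is, $y=Ax$. Reading the chain backwards with $y=Ax$ recovers $i(U+I)x=(U-I)y$ for any $x$ in the domain of $A$.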
The proof of the spectral theorem for normal operators doesn't rely on the proof of the spectral theorem for self-adjoint operators; instead, the two proofs are basically identical.
How do you construct the spectral measure in the self-adjoint case? One way is to look at the $C^*$-algebra generated by the self-adjoint operator $T$ on the Hilbert space $X$; call it $C^*(T)$. Since $C^*(T)$ is commutative, by Gelfand theory it is isomorphic to the algebra of continuous functions on the spectrum of $T$, $C(\sigma(T))$. Given $x,y\in X$, the map $C^*(T)\to\mathbb C$ given by $S\mapsto \langle Sx,y\rangle$ is a bounded linear functional, hence (by the Riesz representation theorem) defines a complex Borel measure $\mu_{x,y}$ on $\mathbb R$, supported in $\sigma(T)$. Using these measures, we can extend the isomorphism $C(\sigma(T))\to C^*(T)$ to a homomorphism $B(\mathbb R)\to \mathcal B(X)$ from the algebra of bounded Borel functions on $\mathbb R$ to the bounded operators on $X$. The spectral measure is just the restriction of this homomorphism to characteristic functions of Borel sets.
If now $T$ is normal, $C^*(T)$ is still commutative, and (again by Gelfand theory) is isomorphic to $C(\sigma(T))$, where now $\sigma(T)\subset\mathbb C$. Given $x,y\in X$, the measure $\mu_{x,y}$ is now a Borel measure on $\mathbb C$ supported in $\sigma(T)$, and in this way we obtain a homomorphism $B(\mathbb C)\to\mathcal B(X)$ from the algebra of bounded Borel functions on $\mathbb C$ to $\mathcal B(X)$, and obtain the spectral measure.
The rest of the proof of the spectral theorem should be the same.
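In finite dimensions the whole construction can be written down explicitly, which may help build intuition. The following is a minimal NumPy sketch, assuming a hypothetical normal matrix built from a random unitary `Q` and hand-picked eigenvalues `lam`; the helpers `phi` and `E` are illustrative names, not standard API.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# Hypothetical normal matrix T = Q diag(lam) Q^* with a random unitary Q.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
lam = np.array([1 + 1j, -2.0, 0.5j, 3 - 1j])
T = Q @ np.diag(lam) @ Q.conj().T

def phi(f):
    """Functional calculus: f -> sum_k f(lam_k) q_k q_k^*."""
    return Q @ np.diag(f(lam)) @ Q.conj().T

def E(indicator):
    """Spectral measure of the set whose indicator function is given."""
    return phi(lambda z: indicator(z).astype(complex))

is_normal = np.allclose(T @ T.conj().T, T.conj().T @ T)

# Spectral projection for the open upper half-plane (a Borel subset of C).
upper = E(lambda z: z.imag > 0)
is_projection = np.allclose(upper @ upper, upper) and np.allclose(upper, upper.conj().T)

# The calculus is multiplicative: z * conj(z) is sent to T T^*.
multiplicative = np.allclose(phi(lambda z: z * np.conj(z)), T @ T.conj().T)
```

Here `E` is exactly the restriction of `phi` to characteristic functions, and $\mu_{x,y}(S)=\langle E(S)x,y\rangle$ is a finite sum of point masses at the eigenvalues.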
EDIT
Hopefully this will help translate my response to language you are familiar with.
Firstly, yes, $C^*(T)$ is as you have defined it.
Secondly, basically the only difference between the two cases is that if $T$ is normal, we define the map $\Phi_0$ from polynomials in two variables $p=p(z,\overline z)$ to $\mathcal B(X)$ by $\sum_{ij}a_{ij}z^i\overline z^j\mapsto \sum_{ij}a_{ij}T^i(T^*)^j$ and extend this by Stone-Weierstrass to a map $\Phi:C(\sigma(T))\to \mathcal B(X)$. We need bivariate polynomials in the normal case because if a compact set $K\subset\mathbb C$ (here $K=\sigma(T)$) is not contained in $\mathbb R$, the polynomials in $z$ alone are in general not closed under complex conjugation on $K$, so the Stone-Weierstrass theorem cannot be applied to them.
Thirdly, there are plenty of books out there that prove the spectral theorem for normal operators, leaving the self-adjoint case as a corollary, but most of the ones I'm familiar with develop some basic $C^*$-algebra theory to make the proofs more transparent. See for instance Conway's or Rudin's functional analysis books, or Murphy's *$C^*$-Algebras and Operator Theory*.
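The map $\Phi_0$ from the second point can be tried out on matrices. The sketch below assumes a hypothetical normal matrix `T` and an arbitrary coefficient array `a` (both made up for illustration) and checks that $\sum_{ij}a_{ij}T^i(T^*)^j$ acts as multiplication by $p(\lambda,\overline\lambda)$ on the eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

# Hypothetical normal matrix T = Q diag(lam) Q^*.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
lam = np.array([2 + 1j, -1.0, 1j, 0.5 - 2j])
T = Q @ np.diag(lam) @ Q.conj().T
T_star = T.conj().T

# Hypothetical coefficients a[i, j] of p(z, zbar) = sum_ij a[i, j] z^i zbar^j.
a = np.array([[1.0, 0.0, 2.0],
              [0.0, -1.0, 0.0],
              [3.0, 0.0, 0.5]])

def phi0(a, T, T_star):
    """Sum of a[i, j] * T^i * (T^*)^j."""
    out = np.zeros_like(T)
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            out = out + a[i, j] * np.linalg.matrix_power(T, i) @ np.linalg.matrix_power(T_star, j)
    return out

def p(z):
    """The same polynomial evaluated pointwise on complex numbers."""
    return sum(a[i, j] * z**i * np.conj(z)**j
               for i in range(a.shape[0]) for j in range(a.shape[1]))

lhs = phi0(a, T, T_star)
rhs = Q @ np.diag(p(lam)) @ Q.conj().T
agree = np.allclose(lhs, rhs)

# Since lhs is normal, its operator norm equals the sup of |p| on the spectrum.
norm_matches = np.isclose(np.linalg.norm(lhs, 2), np.max(np.abs(p(lam))))
```

The norm identity in the last line is the finite-dimensional version of the isometry that lets $\Phi_0$ extend from polynomials to all of $C(\sigma(T))$.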
Best Answer
(1). It should be: "for otherwise $(p(T)-p(\lambda) I)^{-1} q(T)$ would give a bounded inverse for $T - \lambda I$."
(2). $p(x_i) - \mu = 0$ follows by substituting $x = x_i$ into the equation $p(x) - \mu = c \prod_{j} (x - x_j)$.
EDIT: Note that all polynomial functions of $T$ and their inverses (when they exist) commute. Indeed, starting from $q(T) r(T) = r(T) q(T)$, if $r(T)^{-1}$ exists we multiply by $r(T)^{-1}$ on the left and on the right to get $r(T)^{-1} q(T) = q(T) r(T)^{-1}$.
The point in (1) is that $p(x) - p(\lambda) = (x -\lambda) q(x)$ means $p(T) - p(\lambda) I = (T - \lambda I) q(T) = q(T) (T - \lambda I)$, and so if $(p(T) - p(\lambda) I)^{-1}$ existed we could multiply the equation by that to get $I = (T - \lambda I) q(T) (p(T) - p(\lambda) I)^{-1} = q(T) (p(T) - p(\lambda) I)^{-1} (T - \lambda I)$, i.e. $q(T) (p(T) - p(\lambda)I)^{-1}$ is an inverse for $T - \lambda I$.
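The factorization and the resulting inverse formula can be checked numerically. This is a minimal NumPy sketch under assumptions chosen for illustration: a small non-normal matrix `T` with eigenvalues $1, 2, -1$, the polynomial $p(x)=x^3-2x^2+3$, and $\lambda=3$, which is not an eigenvalue, so that every inverse involved exists. The helper `polyval_matrix` is a made-up name.

```python
import numpy as np

def polyval_matrix(coeffs, M):
    """Evaluate a polynomial (highest-degree coefficient first) at a square matrix via Horner's rule."""
    out = np.zeros_like(M, dtype=float)
    I = np.eye(M.shape[0])
    for c in coeffs:
        out = out @ M + c * I
    return out

# Hypothetical example data: upper triangular T with eigenvalues 1, 2, -1.
T = np.array([[1.0, 2.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, -1.0]])
p = np.array([1.0, -2.0, 0.0, 3.0])   # p(x) = x^3 - 2x^2 + 3
lam = 3.0                              # not an eigenvalue of T

# q(x) = (p(x) - p(lam)) / (x - lam); the remainder is zero.
shifted = p.copy()
shifted[-1] -= np.polyval(p, lam)
q, remainder = np.polydiv(shifted, np.array([1.0, -lam]))

I = np.eye(3)
pT_shift = polyval_matrix(p, T) - np.polyval(p, lam) * I   # p(T) - p(lam) I
qT = polyval_matrix(q, T)

# p(T) - p(lam) I = (T - lam I) q(T)
factorization_ok = np.allclose(pT_shift, (T - lam * I) @ qT)

# q(T) (p(T) - p(lam) I)^{-1} is a two-sided inverse of T - lam I.
candidate_inverse = qT @ np.linalg.inv(pT_shift)
inverse_ok = (np.allclose(candidate_inverse @ (T - lam * I), I)
              and np.allclose((T - lam * I) @ candidate_inverse, I))

# Spectral mapping: the eigenvalues of p(T) are p(1), p(2), p(-1).
eigs_pT = np.sort(np.linalg.eigvals(polyval_matrix(p, T)).real)
```

Read contrapositively, this is the argument in (1): if $\lambda$ is in the spectrum of $T$, then $T-\lambda I$ has no inverse, so $p(T)-p(\lambda)I$ cannot be invertible either.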