It is true that a lot of quantum mechanics can be taught and understood without much knowledge of the mathematical foundations, and usually it is. Since QM is a mandatory class at many faculties, one that future experimental physicists have to attend too, this makes sense. But for future theoretical and mathematical physicists, it may pay off to learn a little bit about the math, too.
A little anecdote: John von Neumann once said to Werner Heisenberg that mathematicians should be grateful for QM, because it led to the invention of a lot of beautiful mathematics, but that mathematicians repaid this by clarifying, e.g., the difference between a selfadjoint and a symmetric operator. Heisenberg asked: "What is the difference?"
Suppose you want to calculate exp(A). Why don't you define exp(A) := 1 + A + 1/2 A^2 + ... and require convergence with respect to the operator norm?
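That's correct, at least for bounded A, where the series converges in the operator norm. The benefit of the spectral theorem is that you can define f(A) for any selfadjoint (or, more generally, normal) operator A and any bounded Borel function f, whether or not f has a power series. This comes in handy in many proofs in operator theory.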
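A minimal finite-dimensional sketch of the two routes, assuming NumPy and a small Hermitian matrix chosen purely for illustration (the helper names `exp_series` and `apply_function` are hypothetical, not from any library). It compares the truncated power series with the spectral definition, and then applies an indicator function, which has no power series at all:

```python
import numpy as np

def exp_series(A, terms=30):
    """Truncated power series 1 + A + A^2/2! + ... for a bounded (here: matrix) A."""
    result = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ A / k          # term now holds A^k / k!
        result = result + term
    return result

def apply_function(A, f):
    """Spectral definition f(A) = sum_n f(lambda_n) |psi_n><psi_n| for Hermitian A."""
    eigvals, eigvecs = np.linalg.eigh(A)
    # Scaling column n by f(lambda_n) gives V diag(f(lambda)) V^dagger.
    return (eigvecs * f(eigvals)) @ eigvecs.conj().T

# Illustrative Hermitian matrix (an assumption, not taken from the discussion).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

exp_A_series = exp_series(A)
exp_A_spectral = apply_function(A, np.exp)

# A discontinuous Borel function: the indicator of (2, infinity).
P = apply_function(A, lambda x: (x > 2.0).astype(float))
```

For this matrix both routes agree on exp(A), while the indicator function yields an orthogonal projector onto the eigenspace with eigenvalue above 2, something the power series approach cannot produce.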
In addition to that I've heard that the spectral theorem gives a full description of all self-adjoint operators. Now why is that the case? I mean okay..there's a one to one correspondence between self-adjoint operators and spectral measures..
That's correct, too. Spectral measures are much, much simpler objects than selfadjoint operators, that's why. Furthermore, you can use the spectral theorem to prove that every selfadjoint operator is unitarily equivalent to a multiplication operator (multiply f(x) by x). From an abstract viewpoint, this is a very satisfactory characterization. It does not help much for concrete calculations in QM, though.
BTW: On a more advanced level, you'll need to understand the spectral theorem to understand what a mass gap is in Yang-Mills theory (Millennium Problem).
Hint: In QFT in Minkowski spacetime, one usually assumes that there is a continuous unitary representation of the Poincaré group, and in particular of the commutative subgroup of translations, on the Hilbert space that contains all physical states. The operators that form the representation of the translations have a common spectral measure; this is an application of the SNAG theorem. Apart from the vacuum, the support of this spectral measure is bounded away from zero; that's the definition of the mass gap.
The spectral theorem is that, if $A: D(A) \to {\cal H}$ is a selfadjoint operator, where $D(A) \subset {\cal H}$ is a dense subspace, then there exists a unique projector-valued measure $P^{(A)}$ on the Borel sets of $\mathbb{R}$
such that $$A = \int_{\mathbb R} \lambda dP^{(A)}(\lambda)\:.$$
As a consequence (this is a corollary or a definition depending on the procedure)
$$f(A) = \int_{\mathbb R} f(\lambda) dP^{(A)}(\lambda) \tag{1}$$
for every $f: {\mathbb R} \to {\mathbb C}$ Borel measurable. Taking $f(x) =1$ for all $x\in {\mathbb R}$ we have
$$I = \int_{\mathbb R} dP^{(A)}(\lambda)\:.$$
For selfadjoint operators admitting a Hilbert basis of eigenvectors $\psi_{\lambda, d_\lambda}$, with $\lambda \in \sigma_p(A)$ and $d_\lambda$ labelling the basis vectors of the eigenspace with eigenvalue $\lambda$, the identity above reads (referring to the strong operator topology)
$$f(A) = \sum_{\lambda, d_\lambda} f(\lambda) |\psi_{\lambda, d_\lambda}\rangle\langle \psi_{\lambda, d_\lambda} |\:, \tag{2}$$
with the special case
$$I = \sum_{\lambda, d_\lambda} |\psi_{\lambda, d_\lambda}\rangle\langle \psi_{\lambda, d_\lambda} |\:. \tag{3}$$
In summary, Eqs. (1) and (2) are the central identities; Eq. (3) is just a special case.
Given an orthonormal complete basis $\{\psi_n\}_{n \in \mathbb N} \subset {\cal H}$, one can always define ad hoc a selfadjoint operator $A$ (with no physical meaning in general) to implement the identities above:
$$A = \sum_{n \in \mathbb{N}} \lambda_n |\psi_{n}\rangle\langle \psi_{n} |$$ for a given arbitrary choice of real numbers $\lambda_n$.
The domain of $A$ is
$$\left\{\psi \in {\cal H} \: \left| \: \sum_{n} |\lambda_n|^2 |\langle \psi_n| \psi \rangle|^2 < +\infty\right. \right\}$$
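As a worked illustration of why this domain is in general a proper subspace, assume (purely for the sake of example) the unbounded choice $\lambda_n = n$ with $n = 1, 2, \ldots$, and consider the vector $\psi = \sum_{n} \frac{1}{n} \psi_n$. It belongs to ${\cal H}$ but not to the domain of $A$:
$$\|\psi\|^2 = \sum_{n} \frac{1}{n^2} = \frac{\pi^2}{6} < +\infty\:, \qquad \sum_{n} |\lambda_n|^2 |\langle \psi_n| \psi \rangle|^2 = \sum_{n} n^2 \cdot \frac{1}{n^2} = +\infty\:.$$
On the other hand, every finite linear combination of the $\psi_n$ satisfies the condition, so the domain is dense in ${\cal H}$. If the $\lambda_n$ are bounded, the condition holds for every $\psi$ and the domain is all of ${\cal H}$.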
Best Answer
It is a good idea to first consider bounded operators, or even operators on finite-dimensional Hilbert spaces. In the finite-dimensional case, the spectrum is discrete (there are only eigenvalues, no continuous spectrum); let us denote the eigenvalues by $\lambda_n$. In physicists' terms, the projection-valued measure is then simply $$ \mathrm d\mu^A(\lambda) = \sum_n \delta(\lambda - \lambda_n)\, |\lambda_n\rangle\langle\lambda_n|\, \mathrm d\lambda $$ and the expression you have given reduces to the familiar $$ A = \sum_n \lambda_n\, |\lambda_n\rangle\langle\lambda_n| . $$ We see that $A$ is diagonal in the basis of eigenvectors.
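The finite-dimensional picture can be checked numerically. The following is a sketch only, assuming NumPy and an illustrative Hermitian matrix with a degenerate eigenvalue; `spectral_projection` is a hypothetical helper name. It builds the measure of an interval as the sum of the projectors whose eigenvalues fall inside it:

```python
import numpy as np

def spectral_projection(A, a, b):
    """mu^A((a, b]): sum of |lambda_n><lambda_n| over eigenvalues with a < lambda_n <= b."""
    eigvals, eigvecs = np.linalg.eigh(A)
    mask = (eigvals > a) & (eigvals <= b)
    V = eigvecs[:, mask]             # orthonormal eigenvectors inside the interval
    return V @ V.conj().T            # zero matrix if no eigenvalue lies in (a, b]

# Illustrative matrix (an assumption): eigenvalues 1 (twice) and 3.
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])

# Interval endpoints are kept away from the eigenvalues to avoid round-off issues.
P_low = spectral_projection(A, 0.0, 2.0)    # picks up the eigenvalue 1
P_high = spectral_projection(A, 2.0, 4.0)   # picks up the eigenvalue 3
P_all = spectral_projection(A, 0.0, 4.0)    # covers the whole spectrum

# Reassemble A from its spectral measure: A = 1 * P_low + 3 * P_high.
A_rebuilt = 1.0 * P_low + 3.0 * P_high
```

Disjoint intervals give mutually orthogonal projectors, an interval covering the whole spectrum gives the identity, and weighting each projector by its eigenvalue recovers $A$.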
The integral $\int_{\sigma(A)} \lambda\,d\mu^A(\lambda)$ generalizes this to operators with continuous spectrum. In the case of continuous spectrum, $\int_a^{a+\epsilon} \mathrm d\mu^A(\lambda)$ goes to zero (i.e., the zero operator, in the strong operator topology) for $\epsilon \to 0$, just like any "normal" integral with a non-atomic measure (no delta functions).
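To make this concrete, here is a worked example under an assumption not made in the text above: take the position operator $X$ on $L^2(\mathbb R)$, $(X\psi)(x) = x\,\psi(x)$, whose spectrum is purely continuous. Its projection-valued measure acts by multiplication with the indicator function of the Borel set in question, so for every fixed $\psi \in L^2(\mathbb R)$
$$ \left\| \int_a^{a+\epsilon} \mathrm d\mu^X(\lambda)\, \psi \right\|^2 = \int_a^{a+\epsilon} |\psi(x)|^2\, \mathrm dx \;\longrightarrow\; 0 \quad \text{for } \epsilon \to 0 . $$
The convergence holds vector by vector only: for each $\epsilon > 0$ the projector is nonzero and therefore has operator norm 1.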
I am not so familiar with direct integrals, but it is probably helpful to also think about this formulation in the case of a finite-dimensional space.
Edit in response to the comment
In a finite-dimensional system, expanding an operator $A$ in the form $A = \sum_n \lambda_n\, |\lambda_n\rangle\langle\lambda_n|$ is the diagonalization, expressed in a basis-independent way. You can immediately see that the matrix representation of $A$ in the $|\lambda_n\rangle$-basis is diagonal, and what the eigenvalues and eigenvectors are. What I was trying to explain above is that the expansion $A = \int_{\sigma(A)} \lambda\,d\mu^A(\lambda)$ generalizes that, so $A$ is diagonalizable in this sense.
It is useful mostly in the same way that the expansion $A = \sum_n \lambda_n\, |\lambda_n\rangle\langle\lambda_n|$ is useful in finite dimensions. For example, we can easily calculate functions $f(A)$: $$ f(A) = \int_{\sigma(A)} f(\lambda)\,d\mu^A(\lambda). $$