It is true that a lot of quantum mechanics can be taught and understood without much knowledge of the mathematical foundations, and usually it is. Since QM is a mandatory class at many faculties, one that future experimental physicists have to attend as well, this makes sense. But for future theoretical and mathematical physicists, it may pay off to learn a little bit about the math, too.
A little anecdote: John von Neumann once said to Werner Heisenberg that mathematicians should be grateful for QM, because it led to the invention of a lot of beautiful mathematics, but that mathematicians repaid this by clarifying, e.g., the difference between a selfadjoint and a symmetric operator. Heisenberg asked: "What is the difference?"
Suppose you want to calculate exp(A). Why don't you define exp(A) := 1 + A + 1/2 A^2 + ... and require convergence with respect to the operator norm?
That's correct, at least for bounded A. The benefit of the spectral theorem is that it lets you define f(A) for any selfadjoint (or, more generally, normal) operator A and any bounded Borel function f. This comes in handy in many proofs in operator theory.
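For a bounded operator the two definitions of exp(A) agree. A minimal numerical sketch, assuming a small Hermitian matrix as a stand-in for A (the matrix and the helper names `exp_series` and `exp_spectral` are illustrative and not from the thread), compares the truncated power series with the spectral definition:

```python
import numpy as np

def exp_series(A, terms=30):
    """Truncated power series 1 + A + A^2/2! + ... for a bounded (here: finite) A."""
    result = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ A / k          # term now equals A^k / k!
        result = result + term
    return result

def exp_spectral(A):
    """exp(A) from the eigen-decomposition of a Hermitian matrix A."""
    eigvals, eigvecs = np.linalg.eigh(A)
    # Scale column n of eigvecs by exp(lambda_n), then transform back.
    return (eigvecs * np.exp(eigvals)) @ eigvecs.conj().T

# Hermitian example matrix (an assumption made for illustration).
A = np.array([[2.0, 1.0j], [-1.0j, 3.0]])

# Operator-norm distance between the two results; expected to be tiny.
difference = np.linalg.norm(exp_series(A) - exp_spectral(A), ord=2)
print(difference)
```

The series route needs a bounded A and a function given by a power series; the spectral route only needs the eigen-decomposition, which is why it extends to other functions f.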
In addition to that, I've heard that the spectral theorem gives a full description of all self-adjoint operators. Now why is that the case? I mean, okay, there's a one-to-one correspondence between self-adjoint operators and spectral measures...
That's correct, too. Spectral measures are much, much simpler objects than selfadjoint operators, that's why. Furthermore, you can use the spectral theorem to prove that every selfadjoint operator is unitarily equivalent to a multiplication operator (one that maps f(x) to x f(x)). From an abstract viewpoint, this is a very satisfactory characterization. It does not help much for concrete calculations in QM, though.
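To make the multiplication-operator statement concrete, here is how it reads in finite dimensions (a sketch; the symbols $U$, $e_n$ and $\lambda_n$ are introduced here for illustration). If $A$ is a Hermitian $N \times N$ matrix with orthonormal eigenvectors $e_n$ and eigenvalues $\lambda_n$, let $U$ map a vector $\psi$ to the function $n \mapsto \langle e_n, \psi\rangle$ on the finite set $\{1, \dots, N\}$, so that $U^{-1} g = \sum_n g(n)\, e_n$. Then

$$ (U A U^{-1} g)(n) = \lambda_n\, g(n), $$

so $A$ acts as multiplication by the function $n \mapsto \lambda_n$. In the general case the finite set is replaced by a measure space and $n \mapsto \lambda_n$ by a real-valued measurable function on it.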
BTW: On a more advanced level, you'll need to understand the spectral theorem to understand what a mass gap is in Yang-Mills theory (Millennium Problem).
Hint: In QFT in Minkowski spacetime, one usually assumes that there is a continuous representation of the Poincaré group, and in particular of its commutative subgroup of translations, on the Hilbert space that contains all physical states. The operators that represent the translations have a common spectral measure; this is an application of the SNAG theorem. Apart from the point zero, which belongs to the vacuum, the support of this spectral measure is bounded away from zero; that is the definition of the mass gap.
It is a good idea to first consider bounded operators, or even operators on finite-dimensional Hilbert spaces.
In the finite-dimensional case, the spectrum is discrete (there are only eigenvalues, no continuous spectrum); let us denote the eigenvalues by $\lambda_n$.
In physicists' terms, the projection-valued measure is then simply
$$ \mathrm d\mu^A(\lambda) = \sum_n \delta(\lambda - \lambda_n)\, |\lambda_n\rangle\langle\lambda_n|\, \mathrm d\lambda $$
and the expression you have given reduces to the familiar
$$ A = \sum_n \lambda_n\, |\lambda_n\rangle\langle\lambda_n| . $$
We see that $A$ is diagonal in the basis of eigenvectors.
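The finite-dimensional formulas above can be checked numerically. The following is a minimal sketch, assuming a real symmetric $2 \times 2$ matrix as the example; the matrix and the helper name `spectral_measure` are illustrative choices, not part of the answer.

```python
import numpy as np

# Symmetric example matrix (assumed); its eigenvalues are 1 and 3.
A = np.array([[2.0, 1.0], [1.0, 2.0]])
eigvals, eigvecs = np.linalg.eigh(A)

# One rank-one projector |lambda_n><lambda_n| per eigenvector.
projectors = [np.outer(eigvecs[:, n], eigvecs[:, n].conj())
              for n in range(len(eigvals))]

# The measure of the whole spectrum is the identity operator.
identity = sum(projectors)

# A = sum_n lambda_n |lambda_n><lambda_n|
reconstructed = sum(lam * P for lam, P in zip(eigvals, projectors))

def spectral_measure(interval):
    """Projection assigned to (a, b]: sum of the projectors with a < lambda_n <= b."""
    a, b = interval
    selected = (P for lam, P in zip(eigvals, projectors) if a < lam <= b)
    return sum(selected, np.zeros_like(A))

print(np.allclose(identity, np.eye(2)), np.allclose(reconstructed, A))
```

Here `spectral_measure` plays the role of $\int_a^b \mathrm d\mu^A(\lambda)$: it returns a projector, and projectors belonging to disjoint intervals multiply to zero.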
The integral $\int_{\sigma(A)} \lambda\,d\mu^A(\lambda)$ generalizes this to operators with continuous spectrum. In the case of continuous spectrum, $\int_a^{a+\epsilon} \mathrm d\mu^A(\lambda)$ goes to zero (i.e., the zero operator) for $\epsilon \to 0$, just like any "normal" integral with respect to a non-atomic measure (no delta functions).
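A standard illustration of the continuous case, taking the position operator $(X\psi)(x) = x\,\psi(x)$ on $L^2(\mathbb R)$ as an assumed example (it is not mentioned above): its projection-valued measure assigns to an interval the multiplication by the indicator function of that interval, so for any state $\psi$

$$ \Big\| \int_a^{a+\epsilon} \mathrm d\mu^X(\lambda)\, \psi \Big\|^2 = \int_a^{a+\epsilon} |\psi(x)|^2\, \mathrm dx \;\longrightarrow\; 0 \quad (\epsilon \to 0). $$

The limit is to be read vector by vector (strong convergence) and not in operator norm, since a nonzero projection always has norm 1.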
I am not so familiar with direct integrals, but it is probably helpful to also think about this formulation in the case of a finite-dimensional space.
Edit in response to the comment
Suppose we can rewrite some bounded self-adjoint operator in terms of an integral taken with respect to a projective measure, so what? How does that really help us or show that our operator is "diagonalizable" in some sense?
In a finite-dimensional system, expanding an operator $A$ in the form $A = \sum_n \lambda_n\, |\lambda_n\rangle\langle\lambda_n|$ is the diagonalization, expressed in a basis-independent way. You can immediately see that the matrix representation of $A$ in the $|\lambda_n\rangle$-basis is diagonal, and what the eigenvalues and eigenvectors are. What I was trying to explain above is that the expansion $A = \int_{\sigma(A)} \lambda\,d\mu^A(\lambda)$ generalizes that, so $A$ is diagonalizable in this sense.
It is useful mostly in the same way that the expansion $A = \sum_n \lambda_n\, |\lambda_n\rangle\langle\lambda_n|$ is useful in finite dimensions. For example, we can easily calculate functions $f(A)$:
$$ f(A) = \int_{\sigma(A)} f(\lambda)\,d\mu^A(\lambda). $$
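In finite dimensions this formula is a one-liner. The sketch below assumes a Hermitian matrix and uses a hypothetical helper `apply_function`; it evaluates a square root and an indicator function, the latter being a bounded Borel function that has no power series.

```python
import numpy as np

def apply_function(f, A):
    """f(A) = sum_n f(lambda_n) |lambda_n><lambda_n| for a Hermitian matrix A."""
    eigvals, eigvecs = np.linalg.eigh(A)
    # Scale column n of eigvecs by f(lambda_n), then transform back.
    return (eigvecs * f(eigvals)) @ eigvecs.conj().T

# Symmetric example matrix (assumed); its eigenvalues are 1 and 3.
A = np.array([[2.0, 1.0], [1.0, 2.0]])

# Square root of A: squaring it gives A back.
sqrt_A = apply_function(np.sqrt, A)

# Indicator function of [2, infinity): yields the spectral projection
# onto the eigenspace of the eigenvalue 3.
P = apply_function(lambda lam: (lam >= 2.0).astype(float), A)

print(np.allclose(sqrt_A @ sqrt_A, A), np.allclose(P @ P, P))
```

The indicator case shows why the spectral definition is more flexible than a series expansion: the result is a projection, which the functional calculus produces directly.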
Best Answer
I am not sure I understand the nature of the problem.
The spectral theorem, being a mathematical fact, also holds in QFT. It does not matter if we do not know how the Hilbert space is constructed; it is sufficient to know that it is a Hilbert space and that the operator in question is selfadjoint. Regarding operator-valued distributions $\phi$, the spectral theorem applies to (usually the closures of) the images $\phi(f)$ of these distributions when they are selfadjoint operators.
If the theorem did not hold, then we would have to conclude that the space of states is not a Hilbert space or that the operator is not selfadjoint (more generally, not normal).