I have seen the definition of a modular form, but it seems obscure to me. I get the impression that if I were to read a lot about them, eventually I would see how they can be used. I am curious about the ways in which modular forms are applied. How are they used? What are some important theorems of intrinsic interest that can be (relatively easily) obtained by using them? Are there any that I should look at in particular?
[Math] What are modular forms used for
modular-forms
Related Solutions
It might help to go back to the definition of Hecke operators in level $1$ in Serre's Course in arithmetic. For a prime $p$ and a lattice $\Lambda$, the $p$th Hecke correspondence (I forget if Serre uses exactly this terminology) takes $\Lambda$ to $\sum \Lambda'$, where $\Lambda'$ runs over all index $p$ sublattices of $\Lambda$.
This is a multi-valued function from lattices to lattices (it is $1$-to-$p+1$-valued).
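To see the count $p+1$ concretely, here is a minimal sketch, assuming we take $\Lambda = \mathbb Z^2$. Every index $p$ sublattice contains $p\Lambda$, so it corresponds to a line through the origin in $(\mathbb Z/p)^2$. The function name `index_p_sublattices` is my own, chosen just for illustration.

```python
def index_p_sublattices(p):
    """Enumerate the index-p sublattices of Z^2 for a prime p.

    Each such sublattice contains p*Z^2, so it is determined by its image
    in (Z/p)^2, which is a line through the origin. We return the set of
    those lines, each stored as a frozenset of its points.
    """
    lines = set()
    for x in range(p):
        for y in range(p):
            if (x, y) == (0, 0):
                continue  # the zero vector does not span a line
            # All multiples of (x, y) mod p: the line spanned by (x, y).
            line = frozenset(((k * x) % p, (k * y) % p) for k in range(p))
            lines.add(line)
    return lines


if __name__ == "__main__":
    for p in (2, 3, 5, 7):
        print(p, len(index_p_sublattices(p)))  # the count should be p + 1
```

The count comes out as $(p^2-1)/(p-1) = p+1$, since each line contains $p-1$ nonzero vectors.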
Now lattices (mod scaling) are just elliptic curves: $\Lambda \mapsto \mathbb C/\Lambda$. And so we can also think of this as a multi-valued map from the moduli space of elliptic curves (i.e. the $j$-line, or $Y_0(1)$ if you like) to itself.
How to describe a multi-valued map more geometrically? Think about its graph inside $Y_0(1) \times Y_0(1)$. The graph of a function has the property that its projection onto the first factor is an isomorphism. The graph of a $p+1$-valued function has the property that its projection onto the first factor is of degree $p+1$.
This graph has an explicit description: it is just $Y_0(p)$ (the modular curve of level $\Gamma_0(p)$). Remember that $Y_0(p)$ parameterizes pairs $(E,E')$ of $p$-isogenous curves. We embed it into $Y_0(1) \times Y_0(1)$ in the obvious way, by mapping the pair $(E,E')$ (thought of as an element of $Y_0(p)$) to $(E,E')$ (thought of as an element of the product).
In terms of the upper half-plane variable $\tau$, one can think of this map as sending $\tau \bmod \Gamma_0(p)$ to $\bigl(\tau \bmod SL_2(\mathbb Z), p\tau \bmod SL_2(\mathbb Z) \bigr).$
So we have recast Serre's description of the $p$th Hecke operator in terms of a correspondence on lattices in the geometric language of correspondences on curves: i.e. the $p$th Hecke operator is given by a multi-valued morphism from $Y_0(1)$ to itself, rigorously encoded by its graph thought of as a curve in the product surface $Y_0(1) \times Y_0(1)$, which is in fact isomorphic to $Y_0(p)$.
We can easily compactify the situation, to get $X_0(p)$ embedding as the graph of a correspondence on $X_0(1) \times X_0(1)$.
[Caveat: Actually the map $Y_0(p) \to Y_0(1) \times Y_0(1)$ need not be an embedding; it is a birational map onto its image, but the image can be singular (and the same applies with $X$'s instead of $Y$'s). This is because the point on $Y_0(p)$ is not just the pair $(E,E')$, but the additional data of the $p$-isogeny $E\to E'$, which is not uniquely determined up to isomorphism in some exceptional cases. But this is a technical point which is not worth fussing about at the beginning.]
The advantage of having a geometric correspondence in sight is that whenever we apply any kind of linearization functor to our curve, the correspondence will turn into a genuine single valued operator.
The point is that if we have a multi-valued function from one abelian group to another, we can just add up the values to get a single-valued function.
So the correspondence $T_p$ induces genuine maps from the Jacobian of $X_0(1)$ to itself, or from the cohomology of $X_0(1)$ to itself, or from the space of holomorphic differentials on $X_0(1)$ to itself.
Now actually in the case of $X_0(1)$, which has genus zero, the Jacobian and the space of holomorphic differentials are trivial. But we can do everything with $X_0(N)$ or $X_1(N)$ in place of $X_0(1)$ for any $N$, and all the same remarks apply.
Remembering that the holomorphic differentials on $X_0(N)$ are the weight two cuspforms of level $N$, one can compute that the $p$th Hecke correspondence gives rise to the usual $p$th Hecke operator on cuspforms in this way.
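Here is a sketch of that computation, under the assumptions that $p \nmid N$, that $f = \sum_{n \geq 1} a_n q^n$ (with $q = e^{2\pi i \tau}$) is a weight two cuspform of level $N$, and that we use the usual upper half-plane representatives $\tau \mapsto p\tau$ and $\tau \mapsto (\tau + b)/p$, $0 \leq b < p$, for the $p+1$ values of the correspondence. The normalization depends on these conventions. Pulling back the differential $f(\tau)\,d\tau$ along each of the $p+1$ branches and adding up gives

$$T_p\bigl(f(\tau)\,d\tau\bigr) = p\,f(p\tau)\,d\tau + \frac{1}{p}\sum_{b=0}^{p-1} f\Bigl(\frac{\tau+b}{p}\Bigr)\,d\tau .$$

Since $\frac{1}{p}\sum_{b=0}^{p-1} e^{2\pi i n b/p}$ is $1$ if $p \mid n$ and $0$ otherwise, the second term picks out the coefficients with index divisible by $p$:

$$\frac{1}{p}\sum_{b=0}^{p-1} f\Bigl(\frac{\tau+b}{p}\Bigr) = \sum_{m \geq 1} a_{pm}\,q^m, \qquad p\,f(p\tau) = \sum_{m \geq 1} p\,a_m\,q^{pm},$$

and so

$$T_p f = \sum_{n \geq 1} \bigl(a_{pn} + p\,a_{n/p}\bigr)\,q^n, \qquad \text{where } a_{n/p} = 0 \text{ if } p \nmid n,$$

which is the familiar formula for $T_p$ in weight two.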
What's the point of considering the correspondence? There are many; here's one:
if we reduce everything mod $p$, we get a mod $p$ correspondence on the mod $p$ reduction of $X_0(N)$, whose graph is the mod $p$ reduction of $X_0(Np)$. But this latter reduction is well-known to be singular, and in fact reducible; it is the union of two copies of $X_0(N)$. Thus the $p$th Hecke correspondence mod $p$ decomposes as the sum of two simpler correspondences, which one checks to be the Frobenius morphism from $X_0(N)$ mod $p$ to itself, and its dual.
This is the Eichler--Shimura congruence relation (in some form it actually goes back to Kronecker), and it underlies the relationship between $T_p$-eigenvalues and the trace of Frobenius in the $2$-dimensional Galois reps. attached to Hecke eigenforms.
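One concrete shadow of this relationship can be checked numerically. The sketch below assumes a standard example that is not discussed above: the weight two eigenform $f = q\prod_{n \geq 1}(1-q^n)^2(1-q^{11n})^2$ of level $11$ and the elliptic curve $E : y^2 + y = x^3 - x^2$ of conductor $11$. For primes $p \neq 11$ the $T_p$-eigenvalue $a_p$ of $f$ should equal the trace of Frobenius $p + 1 - \#E(\mathbb F_p)$. The function names are my own.

```python
def level11_coeffs(N):
    """Coefficients a[1..N] of q * prod_{n>=1} (1 - q^n)^2 (1 - q^(11n))^2."""
    c = [0] * N          # c[k] = coefficient of q^k in the infinite product
    c[0] = 1
    for n in range(1, N):
        # Multiply in place by (1 - q^n)^2 (1 - q^(11n))^2, truncating at q^(N-1).
        for step in (n, n, 11 * n, 11 * n):
            for k in range(N - 1, step - 1, -1):
                c[k] -= c[k - step]
    return [0] + c       # shift by the leading factor q, so a[n] is indexed by n


def count_points(p):
    """Number of points of y^2 + y = x^3 - x^2 over F_p, including infinity."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + y - x**3 + x * x) % p == 0
    )
    return affine + 1


if __name__ == "__main__":
    a = level11_coeffs(30)
    for p in (2, 3, 5, 7, 13, 17, 19, 23, 29):
        # Hecke eigenvalue on the left, trace of Frobenius on the right.
        print(p, a[p], p + 1 - count_points(p))
```

The brute-force point count is only meant for small $p$; it makes no attempt to be efficient.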
The theory of modular forms arose out of the study of elliptic integrals (as did the theory of elliptic curves, and much of modern algebraic geometry, and indeed much of modern mathematics). People understood that (complete) elliptic integrals (which we would think of as the number obtained by integrating a de Rham cohomology class, e.g. the one associated to the holomorphic differential on an elliptic curve, over a homology class on the curve) depended on an invariant (what we would think of as the $j$-invariant of the elliptic curve, although historically people used other invariants, often depending on some auxiliary level structure, such as $\lambda$, or $k$ (the square-root of $\lambda$)). This invariant was called the modulus (which is the origin of the adjective modular in this context).
People knew that if you replaced an elliptic curve by an $N$-isogenous one, then the elliptic integral would be multiplied by $N$ (in terms of $\mathbb C/\Lambda$, the elliptic integral is just one of the basis elements for $\Lambda$, and multiplying this by $N$, while keeping the other one fixed, gives a new elliptic curve related to the original one by an $N$-isogeny). They asked themselves how they could describe the modulus for this $N$-isogenous elliptic curve (or integral) in terms of the original one. This led them to find explicit equations for the modular curves $X_0(N)$ (for small values of $N$).
With these kinds of investigations (and remember, these were brilliant people --- Jacobi, Kronecker, Klein, just to mention some spanning a good part of the 19th century), it was natural that they were led to modular forms as well as modular functions (as one example, the Taylor coefficients of elliptic functions give modular forms; as another, the coordinates --- say with respect to Weierstrass elliptic functions --- of $N$-torsion points give level $N$ modular forms).
So all these investigations grew out of the study of elliptic integrals, but became intimately connected with the invention of algebraic topology, the development of complex analysis (by Riemann, and then Schwarz, and then the uniformization theorem), the development of hyperbolic geometry; basically all the fundamental mathematics of the 19th century that then drove much of the developments of 20th century mathematics.
The connections with arithmetic were also observed early on. Jacobi already introduced theta series and saw the relationship with counting representations by quadratic forms (e.g. he proved that the number of ways of writing $n \geq 1$ as a sum of four squares is equal to $8 \sum_{d | n, 4 \not\mid d} d$, using weight $2$ modular forms on $\Gamma_0(4)$).
But Kronecker (and maybe Abel, Eisenstein and even Gauss before him) also knew that modular forms, when evaluated at CM elliptic curves (i.e. at quadratic imaginary values of $\tau$) gave algebraic number values in some contexts. Gauss was led to this by the analogy with cyclotomy: $N$-torsion on an elliptic curve was analogous to $N$th roots of $1$ on the unit circle, and the analogy is tighter when the elliptic curve has CM, because then the $N$-torsion points become a cyclic module over the ring of CMs, just as the $N$th roots of $1$ are a cyclic module over $\mathbb Z$ (i.e. a cyclic group).
Kronecker (and again, maybe people before him) realized that CM elliptic curves corresponded to lattices $\Lambda \subset \mathbb C$ that belong to ideal classes in quadratic imaginary fields, and so saw a relationship between CM elliptic curves and class field theory for quadratic imaginary fields (Kronecker's Jugendtraum). This also related to the previous work on evaluating modular forms at CM points.
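A quick numerical illustration of this phenomenon, for the modular function $j$ rather than a modular form of positive weight: the sketch below assumes the standard formula $j = 1728\,E_4^3/(E_4^3 - E_6^2)$ in terms of the Eisenstein series $E_4 = 1 + 240\sum \sigma_3(n) q^n$ and $E_6 = 1 - 504\sum \sigma_5(n) q^n$, and evaluates it in floating point at a few quadratic imaginary $\tau$. The function names and the truncation at 60 terms are my own choices.

```python
import cmath
import math


def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d**k for d in range(1, n + 1) if n % d == 0)


def j_invariant(tau, terms=60):
    """Approximate j(tau) from truncated q-expansions of E_4 and E_6.

    Intended for tau with imaginary part around 1 or larger, where
    |q| is small and the truncated series converge very quickly.
    """
    q = cmath.exp(2j * cmath.pi * tau)
    e4 = 1 + 240 * sum(sigma(3, n) * q**n for n in range(1, terms))
    e6 = 1 - 504 * sum(sigma(5, n) * q**n for n in range(1, terms))
    return 1728 * e4**3 / (e4**3 - e6**2)


if __name__ == "__main__":
    cm_points = {
        "i": 1j,
        "sqrt(-2)": 1j * math.sqrt(2),
        "(1 + sqrt(-7))/2": (1 + 1j * math.sqrt(7)) / 2,
    }
    for name, tau in cm_points.items():
        # At these CM points the values are (close to) rational integers.
        print(name, j_invariant(tau))
```

At these three points the values agree, up to rounding error, with the classical integers $1728$, $8000$ and $-3375$; these are quadratic imaginary fields of class number one, which is why the values are rational integers and not just algebraic.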
All this is just to say that even in the 19th century the subject was very deep, and already very connected to number theory, as well as everything else.
Ramanujan knew the theory very well, and discovered new phenomena (e.g. his conjectures on the behaviour of $\tau(n)$, defined by $\Delta = q\prod_{n=1}^{\infty} (1- q^n)^{24} = \sum_{n=1}^{\infty} \tau(n) q^n$). Mordell proved Ramanujan's conjecture on the multiplicative nature of $\tau$, and Hecke introduced his operators to systematize Mordell's method of proof.
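The multiplicativity is easy to observe experimentally. Here is a minimal sketch that expands the product defining $\Delta$ and checks $\tau(mn) = \tau(m)\tau(n)$ for coprime $m, n$, together with the prime-power relation $\tau(p^2) = \tau(p)^2 - p^{11}$. The function name `tau_coefficients` is my own.

```python
from math import gcd


def tau_coefficients(N):
    """Return a list t with t[n] = tau(n) for 1 <= n <= N (t[0] is unused)."""
    c = [0] * N          # c[k] = coefficient of q^k in prod (1 - q^n)^24
    c[0] = 1
    for n in range(1, N):
        for _ in range(24):
            # Multiply in place by (1 - q^n), truncating at q^(N-1).
            for k in range(N - 1, n - 1, -1):
                c[k] -= c[k - n]
    return [0] + c       # shift by the leading factor q


if __name__ == "__main__":
    N = 40
    t = tau_coefficients(N)
    print(t[1:7])        # starts 1, -24, 252, ...
    # Multiplicativity on coprime arguments.
    for m in range(1, N + 1):
        for n in range(1, N // m + 1):
            if gcd(m, n) == 1:
                assert t[m * n] == t[m] * t[n]
    # Relation at prime squares, for the primes with p^2 <= N.
    for p in (2, 3, 5):
        assert t[p * p] == t[p] ** 2 - p**11
```

Of course such a check proves nothing; it is only a way to see the pattern that Ramanujan observed.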
At this point, the subject moved in a more representation-theoretic and analytic direction, with the generalization to automorphic forms. With the discovery in the 50s, 60s, and 70s of the modularity conjecture for elliptic curves over $\mathbb Q$, and related ideas, the arithmetic theory of modular forms became a central topic again. See this answer on MO for more on that.
Mazur's theorem on torsion points on elliptic curves over $\mathbb Q$ is one of the deepest results that comes from thinking of $X_0(N)$ and $X_1(N)$ directly in modular terms. But already the proofs are more automorphic in nature, and are focussed on the relationships between modular forms, particularly Hecke eigenforms, and Galois representations. That's where the modern focus primarily lies. You can see some of the other answers linked from my webpage (here) for more on that.
Let me close this long discussion by just saying that the passage to Galois representations as a focus is a natural development from Kronecker's Jugendtraum, but reflects a shifting of attention from abelian class field theory for quadratic imaginary fields to non-abelian (more precisely, $\mathrm{GL}_2$) class field theory for $\mathbb Q$. (Note that the former embeds in the latter, since the induction of a Galois character of a quadratic extension gives a two-dimensional rep. of $G_{\mathbb Q}$.)
Finally, let me mention that the main theme of Mazur's article is congruences between cuspforms and Eisenstein series (this is what the Eisenstein ideal measures), and so it's hard to have one without the other. (In some sense, Eisenstein series are like the trivial Dirichlet character mod $N$, while cuspforms are like the non-trivial characters. Which is more important depends on what you are doing; in many problems you need to consider both.)
Best Answer
One of the simplest applications (and quickest to get to) is to representation numbers of quadratic forms. E.g. Jacobi's formula, that the number of ways of writing a natural number $n$ as the sum of four squares is equal to $8 \sum_{d|n, 4 \not\mid d} d$, was originally proved using modular forms, and I think this is still the most versatile method of proof.
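Jacobi's formula is easy to test by brute force. The sketch below counts all integer solutions of $a^2+b^2+c^2+d^2 = n$ (signs and order count as different) and compares with the divisor sum; the function names `r4` and `jacobi` are my own.

```python
from math import isqrt


def r4(n):
    """Count integer 4-tuples (a, b, c, d) with a^2 + b^2 + c^2 + d^2 = n."""
    m = isqrt(n)
    count = 0
    for a in range(-m, m + 1):
        for b in range(-m, m + 1):
            for c in range(-m, m + 1):
                rest = n - a * a - b * b - c * c
                if rest < 0:
                    continue
                d = isqrt(rest)
                if d * d == rest:
                    # d and -d are distinct solutions unless d == 0.
                    count += 1 if d == 0 else 2
    return count


def jacobi(n):
    """Right-hand side of Jacobi's formula: 8 * (sum of divisors d of n with 4 not dividing d)."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)


if __name__ == "__main__":
    for n in range(1, 21):
        print(n, r4(n), jacobi(n))
```

For instance, $n = 1$ gives $8$ on both sides: the solutions are $(\pm 1, 0, 0, 0)$ and its permutations.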
For more general quadratic forms, one can't necessarily get as precise formulas (so-called cuspforms introduce error terms which don't admit explicit formulas), but one gets approximations (and the Ramanujan--Petersson conjecture on growth of Fourier coeffs. of cuspforms plays a role in bounding the error terms coming from cuspforms).
Some of this (although not Jacobi's formula itself) can be found in Serre's Course in arithmetic, which is the nicest treatment for a beginner.
There are also the applications to the theory of elliptic curves (and then to FLT) mentioned in the comments. For example, the best results in the direction of BSD (such as Gross--Zagier, or Kato's results) rely on the connection between modular forms and elliptic curves.