The theory of modular forms arose out of the study of elliptic integrals (as did the theory of elliptic curves, and much of modern algebraic geometry, and indeed much of modern mathematics). People understood that (complete) elliptic integrals (which we would think of as the number obtained by integrating a de Rham cohomology class, e.g. the one associated to the holomorphic differential on an elliptic curve, over a homology class on the curve) depended on an invariant (what we would think of as the $j$-invariant of the elliptic curve, although historically people used other invariants, often depending on some auxiliary
level structure, such as $\lambda$, or $k$ (the square-root of $\lambda$)). This invariant was called the modulus (which is the origin of the adjective modular in this context).
People knew that if you replaced an elliptic curve by an $N$-isogenous one,
then the elliptic integral would be multiplied by $N$ (in terms of $\mathbb C/\Lambda$, the elliptic integral is just one of the basis elements for $\Lambda$,
and multiplying this by $N$, while keeping the other one fixed, gives a new elliptic curve related to the original one by an $N$-isogeny). They asked themselves how they could describe the modulus for this $N$-isogenous elliptic curve (or integral) in terms of the original one. This led them to find explicit equations for the modular curves $X_0(N)$ (for small values of $N$).
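As a small illustration of what such an explicit equation looks like, here is a sketch for $N = 2$. It assumes the classical modular polynomial $\Phi_2(X, Y)$, whose coefficients are quoted from the standard literature rather than from anything above, and checks it on the CM values $j(i) = 1728$ and $j(2i) = 66^3$ (the lattices $\mathbb Z + \mathbb Z i$ and $\mathbb Z + 2\mathbb Z i$ are related by a $2$-isogeny). The function name `phi2` is just a label chosen here.

```python
def phi2(x, y):
    """Classical modular polynomial Phi_2(X, Y), relating the j-invariants
    of two elliptic curves connected by a 2-isogeny (coefficients assumed
    from the standard literature)."""
    return (
        x**3 + y**3
        - x**2 * y**2
        + 1488 * (x**2 * y + x * y**2)
        - 162000 * (x**2 + y**2)
        + 40773375 * x * y
        + 8748000000 * (x + y)
        - 157464000000000
    )

j_i = 1728       # j-invariant at tau = i
j_2i = 66**3     # j-invariant at tau = 2i

# Exact integer arithmetic, so the relation can be checked on the nose.
print(phi2(j_i, j_2i) == 0)
# The curve with j = 1728 is also 2-isogenous to itself (via 1 + i).
print(phi2(j_i, j_i) == 0)
```

The equation $\Phi_2(X, Y) = 0$ is a (singular) plane model of $X_0(2)$; the 19th century authors typically worked with other moduli such as $\lambda$ or $k$, for which the equations are smaller.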
With these kinds of investigations (and remember, these were brilliant people --- Jacobi, Kronecker, Klein, just to mention some spanning a good part of the 19th century), it was natural that they were led to modular forms as well as modular functions (as one example, the Taylor coefficients of elliptic functions give modular forms; as another, the coordinates --- say with respect to Weierstrass elliptic functions --- of $N$-torsion points give level $N$ modular forms).
So all these investigations grew out of the study of elliptic integrals, but became intimately connected with the invention of algebraic topology, the development of complex analysis (by Riemann, and then Schwarz, and then the uniformization theorem), the development of hyperbolic geometry; basically all
the fundamental mathematics of the 19th century that then drove much of the development of 20th century mathematics.
The connections with arithmetic were also observed early on. Jacobi already introduced theta series and saw the relationship with counting representations by quadratic forms (e.g. he proved that the number of ways of writing $n \geq 1$ as a sum of four squares is equal to $8\sum_{d \mid n,\, 4 \nmid d} d$, using weight $2$ modular forms on $\Gamma_0(4)$).
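Jacobi's four-square theorem is easy to test numerically. The sketch below uses the usual normalization, in which representations are ordered and signs count (this is where the factor of $8$ comes from), and compares a brute-force count with the divisor sum for small $n$. The function names are chosen here for illustration.

```python
from itertools import product
from math import isqrt

def r4_bruteforce(n):
    """Count ordered integer solutions (a, b, c, d) of a^2 + b^2 + c^2 + d^2 = n."""
    m = isqrt(n)
    values = range(-m, m + 1)
    return sum(
        1
        for a, b, c, d in product(values, repeat=4)
        if a * a + b * b + c * c + d * d == n
    )

def r4_jacobi(n):
    """Jacobi's formula for n >= 1: 8 times the sum of divisors d of n with 4 not dividing d."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)

# Compare the two counts for n = 1, ..., 20.
print(all(r4_bruteforce(n) == r4_jacobi(n) for n in range(1, 21)))
```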
But Kronecker (and maybe Abel, Eisenstein and even Gauss before him) also knew that modular forms, when evaluated at CM elliptic curves (i.e. at quadratic imaginary values of $\tau$) gave algebraic number values in some contexts. Gauss was led to this by the analogy with cyclotomy: $N$-torsion on an elliptic curve was analogous to $N$th roots of $1$ on the unit circle, and the analogy is tighter when the elliptic curve has CM, because then the $N$-torsion points become a cyclic module over the ring of CMs, just as the $N$th roots of $1$ are a cyclic module over $\mathbb Z$ (i.e. a cyclic group).
Kronecker (and again, maybe people before him) realized that CM elliptic curves corresponded to lattices $\Lambda \subset \mathbb C$ that belong to ideal classes in quadratic imaginary fields, and so saw a relationship between CM elliptic curves and class field theory for quadratic imaginary fields (Kronecker's Jugendtraum). This also related to the previous work on evaluating modular forms at CM points.
All this is just to say that even in the 19th century the subject was very deep, and already very connected to number theory, as well as everything else.
Ramanujan knew the theory very well, and discovered new phenomena (e.g. his conjectures on the behaviour of $\tau(n)$, defined by $\Delta = q\prod_{n=1}^{\infty} (1- q^n)^{24} = \sum_{n=1}^{\infty} \tau(n) q^n$). Mordell proved Ramanujan's conjecture on the multiplicative nature of $\tau$, and Hecke introduced his operators to systematize Mordell's method of proof.
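The multiplicativity that Mordell proved can be seen directly from the product formula. Here is a minimal sketch that expands $q\prod (1-q^n)^{24}$ by repeated polynomial multiplication (a naive method chosen for transparency, not efficiency) and checks $\tau(mn) = \tau(m)\tau(n)$ for coprime $m, n$, together with the prime-power relation $\tau(p^2) = \tau(p)^2 - p^{11}$. The helper names are chosen here.

```python
def tau_coefficients(N):
    """Return [tau(1), ..., tau(N)] from q * prod_{n>=1} (1 - q^n)^24."""
    # series[k] is the coefficient of q^k in prod (1 - q^n)^24, for k < N.
    series = [0] * N
    series[0] = 1
    for n in range(1, N):
        for _ in range(24):
            # Multiply in place by (1 - q^n), working from high degree down
            # so that series[k - n] is still the old coefficient.
            for k in range(N - 1, n - 1, -1):
                series[k] -= series[k - n]
    # The extra factor q shifts indices: tau(m) is the coefficient of q^(m-1).
    return series

coeffs = tau_coefficients(30)

def tau(m):
    return coeffs[m - 1]

print([tau(m) for m in range(1, 7)])      # first few values of tau
print(tau(6) == tau(2) * tau(3))          # coprime multiplicativity
print(tau(15) == tau(3) * tau(5))
print(tau(4) == tau(2) ** 2 - 2 ** 11)    # Hecke relation at p = 2
```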
At this point, the subject moved in a more representation-theoretic and analytic direction, with the generalization to automorphic forms. With the discovery in the 50s, 60s, and 70s of the modularity conjecture for elliptic curves over $\mathbb Q$, and related ideas, the arithmetic theory of modular forms became a central topic again. See this answer on MO for more on that.
Mazur's theorem on torsion points on elliptic curves over $\mathbb Q$ is one of the deepest results that comes from thinking of $X_0(N)$ and $X_1(N)$ directly in modular terms. But already the proofs are more automorphic in nature, and are focussed on the relationships between modular forms, particularly Hecke eigenforms, and Galois representations. That's where the modern focus primarily lies. You can see some of the other answers linked from my webpage (here) for more on that.
Let me close this long discussion by just saying that the passage to Galois representations as a focus is a natural development from Kronecker's Jugendtraum, but reflects a shifting of attention from abelian class field theory for quadratic imaginary fields to non-abelian (more precisely, $\mathrm{GL}_2$) class field theory for $\mathbb Q$. (Note that the former embeds in the latter, since the induction of a Galois character of a quadratic extension gives a two-dimensional rep. of $G_{\mathbb Q}$.)
Finally, let me mention that the main theme of Mazur's article is congruences between cuspforms and Eisenstein series (this is what the Eisenstein ideal measures), and so it's hard to have one without the other. (In some sense, Eisenstein series are like the trivial Dirichlet character mod $N$, while cuspforms are like the non-trivial characters. Which is more important depends on what you are doing; in many problems you need to consider both.)
Here are the things that you can do with the modular form $f$ corresponding to an elliptic curve $E$:
(a) Determine the number of points on $E$ mod $p$ by computing $a_p(f)$ (easy
for smallish primes via modular symbols computations).
(b) Compute (perhaps with some effort) a modular parameterization of $E$,
and then, by evaluating this at Heegner points, find a point of infinite order
on a twist $E_D$ of $E$, in the cases when this twist has rank one.
(c) Compute whether or not $L(E_D,1) = 0$ for every twist $E_D$ of $E$,
via modular symbols. If you grant BSD, this tells whether or not the
twist $E_D$ has infinitely many points.
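To make item (a) concrete, here is a small sketch for one standard example that is not taken from the discussion above: the curve $E: y^2 + y = x^3 - x^2$ of conductor $11$, whose associated weight $2$ form is $f = q\prod_{n \geq 1}(1-q^n)^2(1-q^{11n})^2$. Instead of modular symbols, the code simply expands this product (which only works because this particular $f$ happens to be an eta product) and compares $a_p(f)$ with $p + 1 - \#E(\mathbb F_p)$, counting points by brute force. All function names are chosen here.

```python
def level11_coefficients(N):
    """Return [a_1, ..., a_N] for f = q * prod (1 - q^n)^2 (1 - q^(11n))^2."""
    series = [0] * N
    series[0] = 1
    for n in range(1, N):
        for step in (n, 11 * n):
            if step >= N:
                continue  # this factor does not affect coefficients below q^N
            for _ in range(2):
                # Multiply in place by (1 - q^step), from high degree down.
                for k in range(N - 1, step - 1, -1):
                    series[k] -= series[k - step]
    # The leading factor q shifts indices: a_m is the coefficient of q^(m-1).
    return series

def count_points(p):
    """Number of points of y^2 + y = x^3 - x^2 over F_p, including infinity."""
    affine = sum(
        1
        for x in range(p)
        for y in range(p)
        if (y * y + y - x**3 + x * x) % p == 0
    )
    return affine + 1

coeffs = level11_coefficients(40)

# Primes of good reduction below 40 (the conductor 11 is skipped).
for p in (2, 3, 5, 7, 13, 17, 19, 23, 29, 31, 37):
    a_p = coeffs[p - 1]
    print(p, a_p, p + 1 - count_points(p))
```

For large $p$ one would of course not count points by brute force; the point is that the two columns agree, so knowing $f$ gives the point counts.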
I'm not sure what other facts about $E$ you are expecting to get. What is it
you would like to know about an elliptic curve in any case? For most people,
the rank (and especially whether or not it is positive) is the main thing, and
conjecturally this is what you can get from the $L$-function of $E$, which is
essentially inaccessible without modular forms, but is highly computable once
you know $f$. (And not just for $E$, but for all its twists.)
Maybe the other thing you might like to know is Sha of $E$, but this is not
proven to be finite in general. Nevertheless, modular forms can sometimes
be used to witness non-trivial elements of Sha. (Read about the theory of
``the visible part of Sha'', by Cremona and Mazur.)
Best Answer
Consider $\Gamma = SL_2(\mathbb{Z})$. Remember that $\gamma = \begin{pmatrix} a & b \cr c& d \end{pmatrix} \in \Gamma$ operates on the upper half plane by $T_\gamma : z \mapsto \dfrac{az+b}{cz+d}$. Write $\pi: H \to H/\Gamma$ for the quotient map.
What is a (meromorphic) differential form $\omega$ on $H/\Gamma$? It should be nothing but a (meromorphic) differential form $\tilde{\omega} := \pi^*\omega$ on $H$ which is invariant under $\Gamma$. Writing $\tilde{\omega} = f(z)dz$ this reads $$ T_\gamma^*\tilde{\omega} = T_\gamma^*(f(z)dz) = f(T_\gamma(z))dT_\gamma(z) = f\left(\dfrac{az+b}{cz+d} \right) d\left(\dfrac{az+b}{cz+d} \right) = f\left(\dfrac{az+b}{cz+d} \right) \frac{ad-bc}{(cz+d)^2} dz $$ Since $ad-bc = 1$, the invariance property $T_\gamma^*\tilde{\omega} = \tilde{\omega}$ translates to $f\left(\dfrac{az+b}{cz+d} \right) = (cz+d)^2f(z)$ which should be familiar.
So basically modular forms of weight 2 correspond to differential forms on the space $H/\Gamma$ parametrizing complex elliptic curves.
Obviously I omitted some important details like behaviour at infinity, so called cusps, or how to interpret modular forms of higher weight (a form $f$ of weight $2n$ corresponds to a section $f(z)(dz)^n \in H^0(H/\Gamma,(\Omega^1_{H/\Gamma})^{\otimes n})$ of the tensor product sheaf). But I hope the principle is clear.
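The same computation in higher weight can be checked numerically. Since $d(T_\gamma z) = (cz+d)^{-2}dz$, invariance of $f(z)(dz)^n$ means $f(T_\gamma z) = (cz+d)^{2n} f(z)$. The sketch below tests this for $n = 6$ with the discriminant form $\Delta(z) = q\prod_{n \geq 1}(1-q^n)^{24}$, $q = e^{2\pi i z}$, and $\gamma: z \mapsto -1/z$ (so $cz + d = z$). The test point and the truncation of the product are arbitrary choices made here.

```python
import cmath
import math

def delta(z, terms=50):
    """Truncated product for Delta(z) = q * prod_{n>=1} (1 - q^n)^24."""
    q = cmath.exp(2j * math.pi * z)
    result = q
    q_power = 1 + 0j
    for _ in range(terms):
        q_power *= q                      # q_power runs through q, q^2, q^3, ...
        result *= (1 - q_power) ** 24
    return result

z = 0.3 + 1.1j                            # any point of the upper half plane
lhs = delta(-1 / z)                       # f(T_gamma z) for gamma = [[0, -1], [1, 0]]
rhs = z**12 * delta(z)                    # (cz + d)^(2n) f(z) with n = 6
relative_error = abs(lhs - rhs) / abs(rhs)
print(relative_error)                     # should be at the level of rounding error
```

Invariance under $z \mapsto z + 1$ is automatic from the $q$-expansion, and these two maps generate $\Gamma$, so this single check covers the essential case.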