The theory of modular forms arose out of the study of elliptic integrals (as did the theory of elliptic curves, and much of modern algebraic geometry, and indeed much of modern mathematics). People understood that (complete) elliptic integrals (which we would think of as the number obtained by integrating a de Rham cohomology class, e.g. the one associated to the holomorphic differential on an elliptic curve, over a homology class on the curve) depended on an invariant (what we would think of as the $j$-invariant of the elliptic curve, although historically people used other invariants, often depending on some auxiliary
level structure, such as $\lambda$, or $k$ (the square-root of $\lambda$)). This invariant was called the modulus (which is the origin of the adjective modular in this context).
People knew that if you replaced an elliptic curve by an $N$-isogenous one,
then the elliptic integral would be multiplied by $N$ (in terms of $\mathbb C/\Lambda$, the elliptic integral is just one of the basis elements for $\Lambda$,
and multiplying this by $N$, while keeping the other one fixed, gives a new elliptic curve related to the original one by an $N$-isogeny). They asked themselves how they could describe the modulus for this $N$-isogenous elliptic curve (or integral) in terms of the original one. This led them to find explicit equations for the modular curves $X_0(N)$ (for small values of $N$).
With these kinds of investigations (and remember, these were brilliant people --- Jacobi, Kronecker, Klein, just to mention some spanning a good part of the 19th century), it was natural that they were led to modular forms as well as modular functions (as one example, the Taylor coefficients of elliptic functions give modular forms; as another, the coordinates --- say with respect to Weierstrass elliptic functions --- of $N$-torsion points give level $N$ modular forms).
So all these investigations grew out of the study of elliptic integrals, but became intimately connected with the invention of algebraic topology, the development of complex analysis (by Riemann, and then Schwarz, and then the uniformization theorem), the development of hyperbolic geometry; basically all
the fundamental mathematics of the 19th century that then drove much of the developments of 20th century mathematics.
The connections with arithmetic were also observed early on. Jacobi already introduced theta series and saw the relationship with counting representations by quadratic forms (e.g. he proved that the number of ways of writing $n \geq 1$ as a sum of four squares is equal to $8\sum_{d \mid n,\ 4 \nmid d} d$, using weight $2$ modular forms on $\Gamma_0(4)$).
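Jacobi's four-square formula is easy to check numerically. The following is a minimal brute-force sketch (the cutoff of $n \leq 20$ is an arbitrary choice): it counts ordered representations of $n$ as $a^2+b^2+c^2+d^2$ with $a,b,c,d \in \mathbb Z$, and compares against $8\sum_{d \mid n,\ 4 \nmid d} d$.

```python
from itertools import product

def r4(n):
    # Count ordered representations n = a^2 + b^2 + c^2 + d^2, a,b,c,d in Z
    # (signs and order count, as in Jacobi's theorem).
    m = int(n ** 0.5) + 1
    return sum(1 for a, b, c, d in product(range(-m, m + 1), repeat=4)
               if a * a + b * b + c * c + d * d == n)

def jacobi(n):
    # 8 times the sum of the divisors of n that are not divisible by 4.
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)

for n in range(1, 21):
    assert r4(n) == jacobi(n)

print(r4(1), r4(2))  # 8 24
```

For instance, $n = 1$ has the $8$ representations $(\pm 1, 0, 0, 0)$ and permutations, matching $8 \cdot 1$.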
But Kronecker (and maybe Abel, Eisenstein and even Gauss before him) also knew that modular forms, when evaluated at CM elliptic curves (i.e. at quadratic imaginary values of $\tau$) gave algebraic number values in some contexts. Gauss was led to this by the analogy with cyclotomy: $N$-torsion on an elliptic curve was analogous to $N$th roots of $1$ on the unit circle, and the analogy is tighter when the elliptic curve has CM, because then the $N$-torsion points become a cyclic module over the ring of CMs, just as the $N$th roots of $1$ are a cyclic module over $\mathbb Z$ (i.e. a cyclic group).
Kronecker (and again, maybe people before him) realized that CM elliptic curves corresponded to lattices $\Lambda \subset \mathbb C$ that belong to ideal classes in quadratic imaginary fields, and so saw a relationship between CM elliptic curves and class field theory for quadratic imaginary fields (Kronecker's Jugendtraum). This also related to the previous work on evaluating modular forms at CM points.
All this is just to say that even in the 19th century the subject was very deep, and already very connected to number theory, as well as everything else.
Ramanujan knew the theory very well, and discovered new phenomena (e.g. his conjectures on the behaviour of $\tau(n)$, defined by $\Delta = q\prod_{n=1}^{\infty} (1- q^n)^{24} = \sum_{n=1}^{\infty} \tau(n) q^n$). Mordell proved Ramanujan's conjecture on the multiplicative nature of $\tau$, and Hecke introduced his operators to systematize Mordell's method of proof.
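The multiplicativity that Mordell proved can be observed directly from the product expansion of $\Delta$. This sketch expands $q\prod_{k=1}^{\infty}(1-q^k)^{24}$ as a truncated power series (the truncation order $N = 40$ is an arbitrary choice) and checks $\tau(6) = \tau(2)\tau(3)$:

```python
# Expand Delta = q * prod_{k>=1} (1 - q^k)^24 as a truncated power series.
# A coefficient list indexed by the exponent of q; exact up to order N,
# since (1 - q^k) only affects coefficients from q^k onward.

N = 40  # truncation order

series = [0] * (N + 1)
series[1] = 1  # start with the series "q"

for k in range(1, N + 1):
    for _ in range(24):  # multiply by (1 - q^k) twenty-four times
        series = [series[i] - (series[i - k] if i >= k else 0)
                  for i in range(N + 1)]

def tau(n):
    return series[n]

print(tau(2), tau(3), tau(6))  # -24 252 -6048
assert tau(6) == tau(2) * tau(3)  # multiplicative at the coprime pair 2, 3
```

Multiplying a truncated series by $(1-q^k)$ is just `new[i] = old[i] - old[i-k]`, so the coefficients up to $q^N$ are exact.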
At this point, the subject moved in a more representation-theoretic and analytic direction, with the generalization to automorphic forms. With the discovery in the 50s, 60s, and 70s of the modularity conjecture for elliptic curves over $\mathbb Q$, and related ideas, the arithmetic theory of modular forms became a central topic again. See this answer on MO for more on that.
Mazur's theorem on torsion points on elliptic curves over $\mathbb Q$ is one of the deepest results that comes from thinking of $X_0(N)$ and $X_1(N)$ directly in modular terms. But already the proofs are more automorphic in nature, and are focussed on the relationships between modular forms, particularly Hecke eigenforms, and Galois representations. That's where the modern focus primarily lies. You can see some of the other answers linked from my webpage (here) for more on that.
Let me close this long discussion by just saying that the passage to Galois representations as a focus is a natural development from Kronecker's Jugendtraum, but reflects a shifting of attention from abelian class field theory for quadratic imaginary fields to non-abelian (more precisely, $\mathrm{GL}_2$) class field theory for $\mathbb Q$. (Note that the former embeds in the latter, since the induction of a Galois character of a quadratic extension gives a two-dimensional rep. of $G_{\mathbb Q}$.)
Finally, let me mention that the main theme of Mazur's article is congruences between cuspforms and Eisenstein series (this is what the Eisenstein ideal measures), and so it's hard to have one without the other. (In some sense, Eisenstein series are like the trivial Dirichlet character mod $N$, while cuspforms are like the non-trivial characters. Which is more important depends on what you are doing; in many problems you need to consider both.)
That's right; 'par' is from 'parallelisation'. The following two quotes from Girard may be illuminating regarding the significance of parallelisation in linear logic:
(ii) The new connectives of linear logic have obvious meanings in
terms of parallel computation, especially the multiplicatives. In
particular, the multiplicative fragment can be seen as a system of
communication without problems of synchronization. The
synchronization is handled by proof-boxes which are typical of the
additive level. Linear logic is the first attempt to solve the problem
of parallelism at the logical level, i.e., by making the success of
the communication process only dependent of the fact that the programs
can be viewed as proofs of something, and are therefore sound.
[Linear Logic in Theoretical Computer Science 50, 1987, p. 3]
Furthermore, multiplicative connectors and rules can be generalised to make a genuine programming language. [Footnote] Cut elimination is in fact parallel communication between processes.
In this language, logic does not ensure termination, but absence of
deadlock.
[emphases in the original, Proofs and Types, 2003 Web edition, p. 154]
See also Di Cosmo and Miller's article Linear Logic.
⅋ also had me baffled for a long time. The intuition I've arrived at is this: You have both an A and a B, but you can't use them together.
Examples of this (considered as entities in a computer program) include:
(The computer program analogy may not make sense for everybody, of course - but this has been my approach to the topic, and my basis for intuitions.)
You can read the rules in an up-right-down manner. The rule introducing ⅋ (on the left of the sequent),
$$ \frac{ Δ_1,A ⊢ Γ_1 \quad Δ_2,B ⊢ Γ_2 }{ Δ_1,Δ_2, A⅋B \ ⊢ \ Γ_1,Γ_2 } $$
can thus be read as "given a value of type A⅋B, you can divide your remaining resources into two, transform them into new ones, and combine the results." And A⅋B cannot be split in any other way - A and B must end up in separate premises and cannot take part in the same transformations.