I’m not convinced this is MO-appropriate, but I’m posting an answer ’cause what I’d have to say is probably too long for a comment.
Expanding on Reid’s comment. Yeah, Lucas’ theorem is nice. It is one of a fair number of combinatorial results which can be thought of as “first steps towards p-adic numbers.” What’s that mean? There’s a “different” absolute value that you can define on rational numbers, which has a lot of the same properties as the usual absolute value, but in other ways behaves totally differently. Actually there are an infinite number of these guys, one for every prime $p$! Each one is called a p-adic absolute value, and you can read about it on Wikipedia.
What the p-adic numbers do is help you to get around the following obstacle: Say you want to tell whether a quotient of two numbers, $\frac{a}{b}$, is divisible by $p$. (We’ll assume for now that $\frac{a}{b}$ is definitely an integer, although this ends up not mattering at all. But “divisibility” is a trickier notion for non-integers.) If $a$ is divisible by $p$ and $b$ isn’t, then it’s obvious that $\frac{a}{b}$ is; if $a$ isn’t divisible by $p$, then of course $\frac{a}{b}$ isn’t. But things get tough if both $a$ and $b$ are divisible by $p$; it could happen that $a$ is divisible by $p^2$, and $b$ is divisible by $p$ but not by $p^2$. Or that $a$ is divisible by $p^{17}$, and $b$ is divisible by $p^{14}$ but not by $p^{15}$. You see how this gets confusing! The p-adic absolute value encodes this sort of information for you.
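Here is a minimal Python sketch of that bookkeeping. The function names (`valuation`, `quotient_valuation`, `padic_abs`) are made up for illustration; the convention assumed is the usual one, $|x|_p = p^{-v}$, where $v$ is the exponent of $p$ in $x$.

```python
from fractions import Fraction


def valuation(n: int, p: int) -> int:
    """Exponent of the largest power of p that divides the nonzero integer n."""
    if n == 0:
        raise ValueError("the valuation of 0 is infinite")
    count = 0
    while n % p == 0:
        n //= p
        count += 1
    return count


def quotient_valuation(a: int, b: int, p: int) -> int:
    """Valuation of a/b: the powers of p in a minus the powers of p in b."""
    return valuation(a, p) - valuation(b, p)


def padic_abs(a: int, b: int, p: int) -> Fraction:
    """p-adic absolute value of a/b (assumed convention: p ** -valuation)."""
    return Fraction(p) ** (-quotient_valuation(a, b, p))


# a is divisible by 3^17, b by 3^14 but not by 3^15 (hypothetical numbers)
a = 3**17 * 10
b = 3**14 * 2
print(quotient_valuation(a, b, 3))  # three powers of 3 survive in a/b
print(padic_abs(a, b, 3))           # small p-adic size means "very divisible by p"
```

A positive valuation says the quotient is divisible by $p$, zero says it isn't, and a negative one says $p$ ends up in the denominator.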
This also explains why we don’t work with, say, 10-adic numbers in mathematics. If you take the integers but consider two integers to be the same whenever they have the same remainder when you divide by $n$, you can still multiply and add and subtract perfectly well. So you get something called a ring. And if $n$ is prime, you can also divide numbers! (Well, you can’t divide by 0, or by a multiple of the prime, which is “the same as” 0. But this is true no matter what, so it’s not a real problem.) This fails for composite $n$: modulo 10, for instance, $2 \cdot 5$ is “the same as” 0 even though neither factor is, so you can’t divide by 2 or by 5.
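To see the difference concretely, here is a small Python sketch (the helper names are mine, not standard) that lists which remainders modulo $n$ you can divide by. It relies on the built-in `pow(a, -1, n)`, available from Python 3.8, which raises `ValueError` when no inverse exists.

```python
from math import gcd


def invertible_residues(n: int) -> list[int]:
    """Nonzero remainders modulo n that have a multiplicative inverse."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


def inverse_mod(a: int, n: int) -> int:
    """Inverse of a modulo n; raises ValueError if a is not invertible."""
    return pow(a, -1, n)


print(invertible_residues(7))   # modulo a prime: every nonzero remainder
print(invertible_residues(10))  # modulo a composite: some are missing
print(inverse_mod(3, 7))        # the remainder x with 3 * x = 1 modulo 7
```

Modulo 7 every nonzero remainder shows up; modulo 10 only the remainders coprime to 10 do.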
Anyway, the patterns for primes in Pascal’s triangle are pretty well known. Google “Pascal’s triangle modulo” (without quotes, probably) to find more stuff. Composites don’t behave as nicely, for the reasons Wikipedia and I both briefly mentioned, but powers of primes do have interesting patterns, which you can read about in the wonderfully-titled paper “Zaphod Beeblebrox’s Brain and the Fifty-ninth Row of Pascal’s Triangle”.
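If you want to play with these patterns yourself, here is a short Python sketch. It assumes the standard statement of Lucas’ theorem (which I haven’t spelled out above): write $n$ and $k$ in base $p$, and then $\binom{n}{k}$ is congruent modulo $p$ to the product of the binomial coefficients of the corresponding digits. The function names are just for illustration.

```python
from math import comb


def binom_mod_p(n: int, k: int, p: int) -> int:
    """C(n, k) modulo the prime p, computed digit by digit in base p."""
    result = 1
    while n > 0 or k > 0:
        n, n_digit = divmod(n, p)
        k, k_digit = divmod(k, p)
        # comb returns 0 when k_digit > n_digit, which kills the product
        result = (result * comb(n_digit, k_digit)) % p
    return result


def pascal_row_mod(n: int, p: int) -> list[int]:
    """Row n of Pascal's triangle reduced modulo p."""
    return [binom_mod_p(n, k, p) for k in range(n + 1)]


# Print the first 8 rows modulo 2: the nonzero entries trace a Sierpinski-like shape.
for row in range(8):
    print(" ".join(str(x) for x in pascal_row_mod(row, 2)))
```

Swapping the 2 for another prime gives the analogous picture for that prime.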
Hope this helps!
Well, first of all, $\pi$ is not just a random real number. Almost every real number is transcendental, so how can we make the notion "$\pi$ is special" (in a number-theoretical sense) more precise?
Start by noticing that $$\pi=\int_{-\infty}^{\infty}\frac{dx}{1+x^2}$$
This already tells us that $\pi$ has something to do with rational numbers. It can be expressed as "a complex number whose real and imaginary parts are values of absolutely convergent integrals of rational functions with rational coefficients, over domains in $\mathbb{R}^n$ given by polynomial inequalities with rational coefficients." Such numbers are called periods.
Coming back to the identity
$$\zeta(2)=\frac{\pi^2}{6}$$
There is a very nice proof of this (that at first seems very unnatural) due to Calabi. It shows that
$$\frac{3\zeta(2)}{4}=\int_0^1\int_0^1\frac{dx\,dy}{1-x^2y^2}$$
by expanding the corresponding geometric series, and then evaluates the integral to $\pi^2/8$. (So yes, $\pi^2$ and all other powers of $\pi$ are periods.) But the story doesn't end here as it is believed that there are truly deep connections between values of zeta functions (or L-functions) and certain evaluations involving periods, such as $\pi$. Another famous problem about primes is Sylvester's problem of which primes can be written as a sum of two rational cubes. So one studies the elliptic curve
$$E_p: p=x^3+y^3$$ and asks whether it has a rational solution. The central value of the corresponding L-function will again involve $\pi$, up to some integer factor and some Gamma factor. Next, values of multiple zeta functions (at positive integer arguments) are also periods:
$$\zeta(s_1,s_2,\dots,s_k)=\sum_{n_1>n_2>\cdots>n_k\geq 1}\frac{1}{n_1^{s_1}\cdots n_k^{s_k}}$$
And they also appear in other very important conjectures such as the Birch and Swinnerton-Dyer conjecture. But of course all of this is really hard to explain without using appropriate terminology, the language of motives etc. So although this answer doesn't say much, it's trying to show that there is an answer to your question out there, and if you study a lot of modern number theory, it might just be satisfactory :-).
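To make the first step of Calabi's argument concrete: expanding $\frac{1}{1-x^2y^2}$ as a geometric series and integrating term by term gives $\int_0^1\int_0^1 x^{2k}y^{2k}\,dx\,dy = \frac{1}{(2k+1)^2}$, so the double integral is the sum of reciprocal odd squares. The Python sketch below (function name hypothetical) only checks numerically that this sum approaches $\pi^2/8$; it is a sanity check, not a proof.

```python
import math


def odd_reciprocal_squares(terms: int) -> float:
    """Partial sum of 1/(2k+1)^2 for k = 0, ..., terms - 1."""
    return sum(1.0 / (2 * k + 1) ** 2 for k in range(terms))


partial = odd_reciprocal_squares(100_000)
target = math.pi ** 2 / 8

# The tail after N terms is roughly 1/(4N), so the two values agree
# to about five decimal places here.
print(partial, target)
```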
I think that in different theories, there is often a "primitive" fact (which is hard to explain further) that lies at the heart of the complication you mention. Let me give examples.
As for the "2 is the oddest prime" credo in number theory, often it boils down to the fact that the only roots of unity in $\mathbb{Q}$ are the second roots of unity, $\pm 1$. Or equivalently, the unit group of $\mathbb{Z}$ is $2$-torsion. I do not know if this can be embedded in a conceptual explanation; maybe it's a fact one has to live with, with ever-recurring consequences.
In the theory of algebraic groups and Lie algebras, e.g. in Chevalley bases and related stuff, the coefficients will be (or have as prime factors) only $2$ or $3$. A consequence is that many integral structures are $p$-integral only for primes $\ge 5$, and this pops up again and again in the theory. See Dietrich Burde's answer for more. I think here an ultimate explanation for this occurrence of $2$ and $3$ is that they appear in the basic combinatorics of root systems. That is the "primitive" fact.
As for the characteristic $2$ exception for quadratic forms, it is the non-equivalence of quadratic and symmetric bilinear forms that causes trouble. This in turn seems to be "primitive": just try to show equivalence and see that you have to invert $2$. And of course one should expect that for something quadratic, the number $2$ plays a special role.
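A short sketch of where the $2$ enters, in notation I am assuming here ($Q$ a quadratic form, $B$ its associated symmetric bilinear form):

$$\begin{aligned}
B(x,y) &= Q(x+y)-Q(x)-Q(y),\\
B(x,x) &= Q(2x)-2Q(x) = 4Q(x)-2Q(x) = 2Q(x),\\
Q(x) &= \tfrac{1}{2}B(x,x) \quad \text{(only if $2$ is invertible).}
\end{aligned}$$

In characteristic $2$ the middle line says $B(x,x)=0$ for every $x$, so $Q$ cannot be recovered from $B$. For instance, $Q(x)=x^2$ over $\mathbb{F}_2$ is nonzero, yet $B(x,y)=(x+y)^2-x^2-y^2=2xy=0$.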
I guess if we were more interested in some tri-linear stuff, or more in things that can be given as $7$-tuples than in pairs, the cases of characteristic $3$ or $7$ would need more attention. So this translates the question into why bilinear things, and pairs, are often natural. (Note that such a basic thing as multiplication, including Lie brackets and other non-associative stuff, is a bilinear map and thus will have a tendency to need special treatment in characteristic $2$. Same for any duality, pairings etc.)
As for $2$ and $3$ as bad primes for elliptic curves, the story seems to be a little different. The answers by jmc and Joe Silverman suggest the following view: there is a family of objects (abelian varieties) which can be parametrised roughly by certain numbers (dimension), and exceptional patterns are related to this parameter; and because elliptic curves are the ones where the parameter is small, there are small numbers that behave irregularly. Now one would think that this is just a high-brow version of Alex Degtyarev's comment. But there is an interesting subtlety: It is not that in the general theory other numbers misbehave instead of $2$ and $3$ (while these become nice); rather, further numbers misbehave in addition to them. In other words: Granted that for every single number you might find some monstrosity somewhere in the general theory. But one might find it surprising that there are some numbers that always need care, even in the most specialised, well-behaved cases. For this, I have no better explanation than:
The strong law of small numbers: Small numbers (not necessarily primes) give exceptional patterns. Because naturally, there are so few of them, and they "have to satisfy too much at once". Maybe this is as far as one gets if one looks for a common pattern between the "primitive" explanations above.