If you look at the explicit formula, then you can get a bound for the error term in the PNT: If
$$
\psi(x) = \sum_{p^k\le x}\log p,
$$
then formula (9) in page 109 of Davenport's book (multiplicative number theory) implies that
$$
\psi(x) = x + \sum_{-T\le \gamma\le T}\frac{x^\rho}{\rho} + O\left(1+\frac{ x\log^2(xT) }{T}\right),
$$
for every $T\ge1$, where the implied constant is completely effective. Now, if we know that all the zeroes of $\zeta$ up to height $T$ lie on the critical line, then this automatically implies that
$$
|\psi(x) - x | \le x^{1/2}\sum_{-T\le \gamma\le T} \frac{1}{\sqrt{1/4+\gamma^2}} + O\left(1+ \frac{ x\log^2(xT)}{T} \right) \ll x^{1/2}\log^2T + \frac{ x\log^2(xT)}{T},
$$
for some effective implied constants. So, in certain ranges of $x$, depending on $T$, you can get very good bounds on the size of $\psi(x)$ and therefore on how many primes there are up to $x$.
I think that in different theories, there is often a "primitive" fact (which is hard to explain further) that lies at the heart of the complication you mention. Let me give examples.
As for the "2 is the oddest prime" credo in number theory, often it boils down to the fact that $\mathbb{Q}$ contains exactly the second roots of unity. Or equivalently, the unit group of $\mathbb{Z}$ is $2$-torsion. I do not know if this can be embedded in a conceptual explanation; maybe it's a fact one has to live with, with ever-occuring consequences.
In the theory of algebraic groups and Lie algebras, e.g. in Chevalley bases and related stuff, the coefficients will be (or have as prime factors) only $2$ or $3$. A consequence is that many integral structures are $p$-integral only for primes $\ge 5$, and this pops up again and again in the theory. See Dietrich Burde's answer for more. I think here an ultimate explanation for this occurrence of $2$ and $3$ is that they appear in the basic combinatorics of root systems. That is the "primitive" fact.
As for the characteristic $2$ exception for quadratic forms, it is the non-equivalence of quadratic and symmetric bilinear forms that causes trouble. This in turn seems to be "primitive", just try to show equivalence and see that you have to invert $2$. And of course one should expect that for something quadratic, the number $2$ plays a special role.
I guess if we were more interested in some tri-linear stuff, or more in things that can be given as $7$-tuples than in pairs, the cases of characteristic $3$ or $7$ would need more attention. So this translates the question into why bilinear things, and pairs, are often natural. (Remark that such a basic thing as multiplication, including Lie brackets and other non-associative stuff, is a bilinear map and thus will have a tendency to need special treatment in characteristic $2$. Same for any duality, pairings etc.)
As for $2$ and $3$ as bad primes for elliptic curves, the story seems to be a little different. The answers by jmc and Joe Silverman suggest the following view: there is a family of objects (abelian varieties) which can be parametrised roughly by certain numbers (dimension), and exceptional patterns are related to this parameter; and because elliptic curves are the ones where the parameter is small, there are small numbers that behave irregular. Now one would think that this is just a high-brow version of Alex Degtyarev's comment. But there is an interesting subtlety: It is not that in the general theory there are numbers different from $2, 3$ that misbehave (while these become nice), but there are more than just them. In other words: Granted that for every single number you might find some monstrosity somewhere in the general theory. But one might find it surprising that there are some numbers that always need care, even in the most specialised, well-behaved cases. For this, I have no better explanation than:
The strong law of small numbers: Small numbers (not necessarily primes) give exceptional patterns. Because naturally, there are so few of them, and they "have to satisfy too much at once". Maybe this is as far as one gets if one seeks after a common pattern between the "primitive" explanations above.
Best Answer
Apparently it is an alternative proof of the infinitude of Carmichael numbers.
The other proof mentioned in the articles ("done by academics 20 years ago") is:
As for the method, unless someone took notes of the lecture at Zhejiang University, I guess it is unpublished at this point (except for the fragments shown in the news).
As a side note, notice two equations shown in one the photos:
$$1. \quad(6n+1)(18n+1)(54n^2+12n+1)$$ $$2. \quad(1764n-139)(2268n+179)(1000188n^2-157752n+6221)$$
This looks very similar to Chernick's result that $(6n+1)(12n+1)(18n+1)$ is a Carmichael number if each of the factors is prime. It is open whether this family of Charmichael numbers is infinite or not.
This is pure speculation, but the non-linear factors in Yu's numbers might make a big difference, since that is not a Dickinson's conjecture-type problem anymore.
Added data for very small values ($n<11$) for the first expression. $C_i$ indicates the $i$-th Carmichael number.
\begin{array}{|c|c|c|c|} \hline \mathrm{n}& \mathrm{eq. 1 } & \\ \hline 1 & 7 \cdot 19\cdot 67 &8911=C_7\\ \hline 2 & 13 \cdot 37\cdot 241 &115921=C_{18}\\ \hline 3 & - &-\\ \hline 4 & - &-\\ \hline 5 & - &-\\ \hline 6 & 37 \cdot 109\cdot 2017& 8134561=C_{93}\\ \hline 7 & 43 \cdot 127\cdot 2731& 14913991=C_{125}\\ \hline 8 & - & -\\ \hline 9 & - & -\\ \hline 10 & 61 \cdot 181\cdot 5521& 60957361=C_{209}\\ \hline \end{array}
Edit. See the comment of Zhiyun Cheng below for a translation of the text in chinese.