[Math] What do cones have to do with quadratics? Why is $2$ special

big-picture, conic-sections, inner-products, intuition, quadratics

I've always been nagged by the two extremely non-obviously related definitions of conic sections (i.e. it seems so mysterious/magical that slices of a cone are somehow related to degree-2 equations in two variables). Recently I came across the following pages/videos:

While 3B1B's video makes a lot of sense and is very beautiful from a geometric standpoint, it does not talk about any of the other conics, or discuss the relationship with "degree 2". Moreover, the second 3B1B video I linked, and then Bhargava's lecture, highlight "degree 2" as something we understand well compared to higher degrees (reminding me a little of Fermat's Last Theorem and the non-existence of solutions for $n > 2$).

So, I suppose my questions are as follows:

  1. Why, from an intuitive standpoint, should we expect cones to be deeply related to zero-sets of degree 2 algebraic equations?

and more generally:

  2. Is there some deep reason why "$2$" is so special? I've often heard the quip that "mathematics is about turning confusing things into linear algebra", because linear algebra is "the only subject mathematicians completely understand"; but it seems we also understand a lot of nice things about quadratics: we have the aforementioned relationship with cones, a complete understanding of rational points on conics, and the Pythagorean theorem (oh! and I just thought of quadratic reciprocity). $2$ is also special in all sorts of algebraic contexts, and it is the only possible degree of a nontrivial finite extension of $\mathbb R$, which in particular makes $\mathbb C$ 2-dimensional.

It is also interesting to note that many equations in physics involve $2$ (second derivatives, inverse-square laws), though that may be a stretch. I appreciate any ideas you share!

$$\rule{5cm}{0.4pt}$$

EDIT 3/12/21: I was just thinking about variances and least-squares regression. "$2$" is extremely special in these areas: Why square the difference instead of taking the absolute value in standard deviation?, Why is it so cool to square numbers (in terms of finding the standard deviation)?, and the absolutely mind-blowing animation of the physical realization of PCA with Hooke's law: Making sense of principal component analysis, eigenvectors & eigenvalues.

In the links I just listed, it seems the most popular (but still, to me, not very satisfying) answer is that squaring is convenient (smooth, easy to minimize, variances add for independent r.v.'s, etc.), a fact that may be a symptom of a deeper connection with the Hilbert-space structure of $L^2$. There may also be something in how, when dealing with squares, Pythagoras gives us that minimizing reconstruction error is the same as maximizing projection variance in PCA. Honorable mentions to Qiaochu Yuan's answer about rotation invariance, and Aaron Meyerowitz's answer about the arithmetic mean being the unique minimizer of the sum of squared distances from a given point. As for the incredible alignment with our intuition in the form of the animation with springs and Hooke's law that I linked, I suppose I'll chalk that one up to coincidence, or some sort of SF 😉
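Meyerowitz's point (the mean uniquely minimizes the sum of squared distances, while the absolute-value loss is minimized at the median instead) is easy to check numerically. A minimal sketch, with an arbitrary random sample and a grid search purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=101)  # odd length, so the median is a data point

# Grid-search candidate centers c and compare the two loss functions.
cs = np.linspace(x.min(), x.max(), 10001)
sq_loss = ((x[:, None] - cs[None, :]) ** 2).sum(axis=0)
abs_loss = np.abs(x[:, None] - cs[None, :]).sum(axis=0)

c_sq = cs[sq_loss.argmin()]    # minimizer of the sum of squares
c_abs = cs[abs_loss.argmin()]  # minimizer of the sum of absolute values

print(abs(c_sq - x.mean()))       # ~0: squared loss picks out the mean
print(abs(c_abs - np.median(x)))  # ~0: absolute loss picks out the median
```

So "why square?" really is the question of why the mean, rather than the median, is the natural notion of center.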

$$\rule{5cm}{0.4pt}$$

EDIT 2/11/22:
I was thinking about Hilbert spaces, wondering again why they behave so nicely: they have the closest-point lemma (leading to the orthogonal decomposition $\mathcal H = \mathcal M \oplus \mathcal M^\perp$ for closed subspaces $\mathcal M$) and orthonormal bases (leading to Parseval's identity, and to the fact that a series of orthogonal elements converges if and only if the sum of the squared lengths converges). I came to the conclusion that the key result each time seems to be the Pythagorean theorem (e.g. the parallelogram law is an easy corollary of Pythagoras). So that raises the question: why is the Pythagorean theorem so special? The article linked in the accepted answer to this question: What does the Pythagorean Theorem really prove? tells us that the Pythagorean theorem essentially boils down to the fact that a right triangle can be subdivided into two triangles both similar to the original.
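Both identities are easy to sanity-check numerically in $\mathbb R^n$ (a stand-in for any Hilbert space); a sketch with arbitrary random vectors:

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.normal(size=5)
w = rng.normal(size=5)

# Make v orthogonal to u by subtracting the projection of w onto u.
v = w - (w @ u) / (u @ u) * u

# Pythagorean theorem: for orthogonal u, v we have |u+v|^2 = |u|^2 + |v|^2.
lhs = np.linalg.norm(u + v) ** 2
rhs = np.linalg.norm(u) ** 2 + np.linalg.norm(v) ** 2
print(np.isclose(lhs, rhs))  # True

# Parallelogram law, which holds for *any* u, w, orthogonal or not:
# |u+w|^2 + |u-w|^2 = 2|u|^2 + 2|w|^2.
p_lhs = np.linalg.norm(u + w) ** 2 + np.linalg.norm(u - w) ** 2
p_rhs = 2 * np.linalg.norm(u) ** 2 + 2 * np.linalg.norm(w) ** 2
print(np.isclose(p_lhs, p_rhs))  # True
```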

The fact that this subdivision is obtained by projecting the vertex onto the hypotenuse (projection being deeply related to inner products) is likely also significant… ahh, indeed, by the "commutativity of projection", projecting a leg onto the hypotenuse is the same as projecting the hypotenuse onto the leg, and by orthogonality of the legs, the projection of the hypotenuse onto a leg is simply that leg itself! The square comes from the fact that projection scales proportionally with each vector, and there are two vectors involved in the operation of projection.
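That "two vectors, each scaling proportionally" remark is just bilinearity of the inner product, which is exactly where the exponent $2$ enters. A quick numeric illustration (the vectors and scalars are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
u, v = rng.normal(size=3), rng.normal(size=3)
c, d = 2.5, -1.5

# Bilinearity: scaling either argument scales the result, so scaling
# BOTH arguments by c scales the inner product by c^2.
assert np.isclose((c * u) @ (d * v), c * d * (u @ v))
assert np.isclose((c * u) @ (c * v), c**2 * (u @ v))

# Projection of u onto v, built from the inner product; it is linear in u.
proj = lambda a, b: (a @ b) / (b @ b) * b
assert np.allclose(proj(c * u, v), c * proj(u, v))
print("bilinearity checks pass")
```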

I suppose this sort of "algebraic understanding" of the projection explains the importance of "2" better than the geometry does, since knowing only about the "self-similarity of the subdivisions" of the right triangle, one has to wonder why, say, tetrahedra or other shapes in other dimensions don't have this "self-similarity of the subdivisions" property. However, it is still not clear to me why projection seems so fundamentally "2-dimensional". Perhaps 1-dimensionally there is the "objective" perception of a vector, 2-dimensionally there is the "subjective" perception of one vector in the eyes of another, and there is simply no good 3-dimensional analogue for 3 vectors?

There might also be some connection between the importance of projection and the importance of the Riesz representation theorem (all linear "projections" onto a 1-dimensional subspace, i.e. linear functionals, are actually literal projections against a vector in the space).

$$\rule{5cm}{0.4pt}$$

EDIT 2/18/22: Touching again on the degree-2 Diophantine equations I mentioned above, a classical example is the number of ways $r_n(k)$ to write $k$ as a sum of $n$ squares. There are a number of nice results here, the most famous being Fermat's two-square theorem and Jacobi's four-square theorem. A key part of the proof of the latter is the Poisson summation formula applied to the Euler/Jacobi theta function $\theta(\tau) := \sum_{n=-\infty}^\infty e^{i \pi n^2 \tau}$, which depends on the fact that the Gaussian is stable under the Fourier transform. I still don't understand intuitively why this is the case (see Intuitively, why is the Gaussian the Fourier transform of itself?), but there seems to be some relation to Hölder conjugates and $L^p$ spaces (or, in the Gaussian case, to $L^2$), since those show up in generalizations of the Hardy uncertainty principle: "completing the square", again an algebraic nicety of squares, is used in the proof of Hardy's theorem, and the Hölder conjugates may have to do with the inequality $-x^p + xu \leq u^q$ (Problem 4.1 in Stein and Shakarchi's Complex Analysis, where the LHS essentially comes from computing the Fourier transform of $e^{-x^p}$). Of course, why the Gaussian itself appears everywhere is another question altogether: https://mathoverflow.net/questions/40268/why-is-the-gaussian-so-pervasive-in-mathematics.
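Jacobi's four-square theorem, $r_4(n) = 8\sum_{d \mid n,\ 4 \nmid d} d$, can at least be verified by brute force for small $n$; a sketch (the range of $n$ is arbitrary):

```python
import itertools
import math

def r4(n):
    """Count (a, b, c, d) in Z^4 with a^2 + b^2 + c^2 + d^2 = n
    (signed, ordered representations)."""
    m = math.isqrt(n)
    count = 0
    for a, b, c in itertools.product(range(-m, m + 1), repeat=3):
        rem = n - a * a - b * b - c * c
        if rem < 0:
            continue
        s = math.isqrt(rem)
        if s * s == rem:
            count += 1 if s == 0 else 2  # d = 0, or d = +s and d = -s
    return count

def r4_jacobi(n):
    """Jacobi: r_4(n) = 8 * (sum of divisors of n not divisible by 4)."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)

assert all(r4(n) == r4_jacobi(n) for n in range(1, 30))
print("Jacobi's formula verified for n = 1..29")
```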

This (squares leading to a decent theory of $r_n(k)$, and squares leading to nice properties of the Gaussian) is probably also connected to the fact that $\int_{\mathbb R} e^{-x^2}\,dx$ has a nice explicit value, namely $\sqrt \pi$. I tried seeing whether there was a connection between this occurrence of $\pi$ and the $\pi$ one gets from calculating the area of a circle "shell-by-shell", $\frac 1{N^2} \sum_{k=0}^{N^2} r_2(k) \to \pi$, but I couldn't find anything: Gaussian integral using Euler/Jacobi theta function and $r_2(k)$ (number of representations as sum of 2 squares).
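The "shell-by-shell" limit itself is easy to check numerically: summing $r_2(k)$ for $k \le R^2$ counts the lattice points in a disk of radius $R$, and dividing by $R^2$ approaches $\pi$ (the Gauss circle problem). A sketch, with an arbitrary radius:

```python
import math

def lattice_points_in_disk(R):
    """Count integer points (a, b) with a^2 + b^2 <= R^2,
    i.e. sum_{k=0}^{R^2} r_2(k)."""
    return sum(2 * math.isqrt(R * R - a * a) + 1 for a in range(-R, R + 1))

R = 1000
approx = lattice_points_in_disk(R) / R**2
print(approx)  # close to pi; the error shrinks as R grows
```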

Best Answer

A cone itself is a quadratic! Just in three variables rather than two. More precisely, conical surfaces are "degenerate hyperboloids," such as

$$x^2 + y^2 - z^2 = 0.$$

Taking conic sections corresponds to intersecting the cone with a plane $ax + by + cz = d$, which amounts to replacing one of the three variables with a linear combination of the other two plus a constant; this produces a quadratic in two variables. The easiest case to see is that replacing $z$ by a constant $r$ gives a circle $x^2 + y^2 = r^2$ (which is how you can come up with the above equation: a cone is the shape whose slice at $z = \pm r$ is a circle of radius $r$). Similarly, replacing $x$ or $y$ by a constant gives a hyperbola.
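The substitution picture can be verified concretely. For instance, slicing the cone by the tilted plane $z = y + 1$ gives $x^2 + y^2 - (y+1)^2 = x^2 - 2y - 1 = 0$, a parabola in the two remaining variables; a quick numerical check (the slicing plane is my choice of example):

```python
import random

# The cone x^2 + y^2 - z^2 = 0 sliced by the plane z = y + 1.
# Substituting gives x^2 + y^2 - (y + 1)^2 = x^2 - 2y - 1 = 0,
# i.e. the parabola x^2 = 2y + 1.

def on_cone(x, y, z, tol=1e-9):
    return abs(x * x + y * y - z * z) < tol

random.seed(0)
for _ in range(1000):
    x = random.uniform(-10, 10)
    y = (x * x - 1) / 2   # solve the parabola x^2 = 2y + 1 for y
    z = y + 1             # the point lies on the slicing plane...
    assert on_cone(x, y, z)  # ...and indeed on the cone
print("every sampled point of the parabola lies on the cone")
```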

I don't know that I have a complete picture to present of why quadratics are so much easier to understand than cubics and so forth. Maybe the simplest thing to say is that quadratic forms are closely related to square (symmetric) matrices $M$, since they can be written $q(x) = x^T M x$. We have lots of tools for understanding square matrices, e.g. the spectral theorem, all of which can then be brought to bear on quadratic forms. The corresponding object for a cubic form is a degree-$3$ tensor, which is much harder to analyze.
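For instance, the cross term of a binary quadratic form splits symmetrically into a matrix $M$, and the spectral theorem diagonalizes the form in rotated coordinates, classifying the conic by eigenvalue signs. A sketch with an arbitrary example form, $x^2 + 4xy + y^2$:

```python
import numpy as np

# The conic x^2 + 4xy + y^2 = 1 as a quadratic form q(v) = v^T M v,
# with the cross term 4xy split evenly across the off-diagonal.
M = np.array([[1.0, 2.0],
              [2.0, 1.0]])

# Spectral theorem: M = Q diag(eigs) Q^T with Q orthogonal, so rotating
# coordinates by Q diagonalizes the form.
eigs, Q = np.linalg.eigh(M)
print(eigs)  # one negative and one positive eigenvalue: a hyperbola

# Check q(v) = sum_i eigs[i] * w_i^2 where w = Q^T v, at a random point.
rng = np.random.default_rng(0)
v = rng.normal(size=2)
w = Q.T @ v
assert np.isclose(v @ M @ v, eigs @ (w ** 2))
```

The analogous diagonalization question for a symmetric degree-$3$ tensor has no such clean answer, which is one concrete sense in which cubics are harder.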

Maybe a quite silly way to say it is that $2$ is special because it's the smallest positive integer which isn't equal to $1$. So quadratics are the simplest things that aren't linear and so forth.
