Intuition and Meaning Behind Quadratic Forms

linear-algebra quadratic-forms

My professor just covered quadratic forms, but unfortunately did not give any intuition behind their meaning, so I'm hoping to get some of that cleared up.

I know that we define a quadratic form as $Q(x) = x^T Ax$, for some symmetric (i.e. orthogonally diagonalizable) matrix $A$. Is there some significance to this other than that it is a "cool" transformation from $\Bbb R^n \to \Bbb R$? Is it special in some way?

He also spoke about the Principal Axis Theorem. After looking on Wikipedia, it seems that the PAT that he described is wildly different from what most of the internet says.

The professor said that the PAT tells us that any quadratic form $Q(x)$ can be "transformed" (what does that even mean???) into the quadratic form $Q(y) = y^T Dy$ with no cross product term (the cross product term is defined as the $x_1\cdot x_2$ term in the quadratic form), where $D$ is a diagonal matrix.

His proof used the fact that $Q(x) = x^T Ax = (x^T Q)D(Q^T x) = y^T Dy$ for some $y$.
What does the "transformed quadratic form" represent? Why is it significant? All my professor did was define these things, and didn't explain any intuition.

Best Answer

Quadratic forms are great! They are related to some pretty great stuff such as bilinear forms and the Arf invariant. Quadratic forms in general encode the so-called "quadric surfaces" such as ellipsoids, hyperboloids, and so on. The principal axis theorem, which is the spectral theorem for real symmetric matrices, is one of the most important theorems in linear algebra! It is what allows us to "transform" the quadratic forms your professor mentioned.

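Take a quadratic form $q: \Bbb R^n \to \Bbb R$ defined by $x \mapsto x^tAx$. Since $A$ is symmetric (or can be made symmetric by replacing it with $\tfrac{1}{2}(A + A^t)$, which gives the same values of $x^tAx$), the principal axis theorem says we may orthogonally diagonalize it: $A = QDQ^t$, where $Q$ is an orthogonal matrix whose columns are eigenvectors of $A$. This is what eliminates any of the cross-terms such as $x_1x_2$. Going through with the orthogonal diagonalization and setting $y = Q^tx$, we get $x^tAx = x^tQDQ^tx = (Q^tx)^tD(Q^tx) = y^tDy$. The matrix $D$ is diagonal, and its diagonal entries are the eigenvalues of $A$. In other words, "transforming" the form just means rewriting it in the coordinates $y$ given by an orthonormal basis of eigenvectors (the principal axes).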
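If it helps to see the change of variables numerically, here is a minimal NumPy sketch. The matrix `A` and the test vector `x` are made-up values chosen only for illustration (they do not come from the question), and `np.linalg.eigh` is used because it is the routine intended for symmetric matrices.

```python
import numpy as np

# Hypothetical symmetric matrix, chosen only for illustration:
# q(x) = 2*x1^2 + 2*x1*x2 + 2*x2^2 - x3^2
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, -1.0]])

# eigh returns the eigenvalues in ascending order and an orthogonal
# matrix Q whose columns are the corresponding eigenvectors.
eigenvalues, Q = np.linalg.eigh(A)
D = np.diag(eigenvalues)

x = np.array([0.5, -1.0, 2.0])  # arbitrary test vector
y = Q.T @ x                     # coordinates of x along the principal axes

q_x = x @ A @ x                 # x^t A x, cross-term included
q_y = y @ D @ y                 # y^t D y, a pure sum of eigenvalue * y_i^2

print(eigenvalues)
print(q_x, q_y)
```

The two printed values of the form should agree up to floating-point error, since $Q$ being orthogonal means $y = Q^tx$ is just $x$ expressed in a rotated (or reflected) coordinate system.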

The significance of this "transformed" quadratic form is that it is more meaningful in terms of the information it encodes. Without those pesky cross-terms, we can see exactly what the quadric surface is without the fluff. The easiest surfaces to identify are level sets of forms like $a_1y_1^2 + a_2y_2^2 + a_3y_3^2$, since the signs of $a_1, a_2$ and $a_3$ are how we distinguish between an ellipsoid (all the same sign), a hyperboloid (mixed signs), and so on.
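As a small two-variable illustration (the specific form here is a made-up example, not one from the question), take $q(x) = x_1^2 + 4x_1x_2 + x_2^2$. Its matrix and eigen-data are

$$ A = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}, \qquad \lambda_1 = 3,\ v_1 = \tfrac{1}{\sqrt 2}\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad \lambda_2 = -1,\ v_2 = \tfrac{1}{\sqrt 2}\begin{pmatrix} 1 \\ -1 \end{pmatrix}. $$

Setting $y_1 = (x_1 + x_2)/\sqrt{2}$ and $y_2 = (x_1 - x_2)/\sqrt{2}$, which is $y = Q^tx$ with the columns of $Q$ being $v_1, v_2$, gives

$$ q = 3y_1^2 - y_2^2. $$

The opposite signs show that the level curve $q = 1$ is a hyperbola whose axes point along $v_1$ and $v_2$, something that is hard to read off from the original expression with its $4x_1x_2$ term.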

They are also of great use in physics when we are dealing with the inertia tensor of a rigid body. They are one of the coolest things we learn about in first-year linear algebra!

Edit: Check out these notes by Professor Mike Hopkins at Harvard about quadratic and bilinear forms. Professor Hopkins gave a really good lecture at Northwestern this past May in which he discussed some of the more high-level aspects of quadratic forms and how they connect to the Arf invariant. His lecture and these notes are accessible to anyone taking a linear algebra course. These notes in particular should help you have some "aha!" moments and build deeper connections and intuitions about quadratic forms.

http://math.harvard.edu/~mjh/northwestern.pdf

To add to amd's comment, given a $C^2$, real-valued function $f$ of $n$ variables, and a critical point $x_0$ of the function, we can Taylor expand $f$ to second order about $x_0$ to discern the nature of the critical point. That is,

$$ f(x) = f(x_0) + \tfrac{1}{2}(x - x_0)^tH(x - x_0) + o(\Vert{x - x_0}\Vert^2), $$

where $H$ is the Hessian of $f$, which encodes all second-order partials of $f$ at the point $x_0 \in \Bbb R^n$; the first-order term vanishes because $x_0$ is a critical point. Since $f$ is $C^2$, the Hessian of $f$ is symmetric, and we may orthogonally diagonalize it as $H = QDQ^t$ (this is "transforming" the quadratic form via the change of variables $y = Q^t(x - x_0)$):

$$ f(x) = f(x_0) + \tfrac{1}{2}y^tDy + o(\Vert{y}\Vert^2). $$

From $D$, we can read off right away whether $x_0$ is a local max, min, or neither, since the entries of $D$ along its diagonal are the eigenvalues of the Hessian. If all the eigenvalues are strictly positive, then $x_0$ is a local minimum (think concave up), and if all are strictly negative, then $x_0$ is a local maximum (think concave down). If $D$ has both positive and negative eigenvalues, $x_0$ is a saddle point. If some eigenvalue is zero and the rest share a sign, the second-order term alone does not decide the question.

In short, this makes the classification of extrema simpler, thanks to the fact that the second-order term in the Taylor expansion of $f$ about a critical point is itself a quadratic form.
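The eigenvalue test can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: `classify_critical_point` is a hypothetical helper name, the tolerance `tol` is an arbitrary choice for deciding when an eigenvalue counts as zero, and the Hessians in the usage lines are computed by hand for simple example functions.

```python
import numpy as np

def classify_critical_point(H, tol=1e-10):
    """Classify a critical point from the eigenvalue signs of its symmetric Hessian H.

    Hypothetical helper for illustration; tol is an assumed cutoff for "zero".
    """
    # eigvalsh is intended for symmetric matrices and returns real eigenvalues,
    # i.e. the diagonal entries of D in H = Q D Q^t.
    eigenvalues = np.linalg.eigvalsh(H)
    if np.all(eigenvalues > tol):
        return "local minimum"
    if np.all(eigenvalues < -tol):
        return "local maximum"
    if np.any(eigenvalues > tol) and np.any(eigenvalues < -tol):
        return "saddle point"
    # At least one (near-)zero eigenvalue and no sign change: test is inconclusive.
    return "inconclusive"

# f(x, y) = x^2 + x*y + y^2 at the origin: Hessian [[2, 1], [1, 2]]
print(classify_critical_point(np.array([[2.0, 1.0], [1.0, 2.0]])))
# f(x, y) = x^2 - y^2 at the origin: Hessian diag(2, -2)
print(classify_critical_point(np.array([[2.0, 0.0], [0.0, -2.0]])))
# f(x, y) = x^4 + y^4 at the origin: Hessian is the zero matrix
print(classify_critical_point(np.zeros((2, 2))))
```

Note that the function never needs $Q$ itself: only the signs of the diagonal entries of $D$ matter for the classification.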
