[Math] Understanding Sylvester’s theorems and change of coordinates

coordinate systems, linear algebra, matrices, quadratic-forms

(Sylvester's Theorem). Any quadratic form $q$ over $\mathbb{R}$ with matrix $A$ has the form \begin{equation}q({\bf{v}}) = \sum_{i=1}^t x_i^2 - \sum_{i=1}^u x_{t+i}^2\end{equation} with respect to a suitable basis, where $t + u = \text{rank}(A)$.

Equivalently, given a symmetric matrix $A \in \mathbb{R}^{n \times n}$, there is an invertible matrix $P \in \mathbb{R}^{n \times n}$ such that $P^TAP = D$, where $D = (\alpha_{ij})$ is a diagonal matrix with $\alpha_{ii} = 1$ for $1 \leq i \leq t$, $\alpha_{ii} = -1$ for $t + 1 \leq i \leq t + u$, and $\alpha_{ii} = 0$ for $t + u + 1 \leq i \leq n$, and $t + u = \text{rank}(A)$.

I am trying to understand the above and why I would want to perform this coordinate change when I can diagonalise it instead. I tried an example with $q(x,y) = -x^2 + 6xy -9y^2$ :

$$A = \begin{pmatrix} -1 & 3 \\ 3 & -9 \end{pmatrix}, \lambda_1 = -10, v_1 = \begin{pmatrix}\frac{1}{\sqrt{10}} \\ \frac{-3}{\sqrt{10}}\end{pmatrix}, \lambda_2 = 0, v_2 = \begin{pmatrix}\frac{3}{\sqrt{10}} \\ \frac{1}{\sqrt{10}}\end{pmatrix}$$

$$S = \begin{pmatrix} \frac{1}{\sqrt{10}} & \frac{3}{\sqrt{10}} \\ \frac{-3}{\sqrt{10}} & \frac{1}{\sqrt{10}} \end{pmatrix}, S^TAS = \begin{pmatrix} -10 & 0 \\ 0 & 0 \end{pmatrix} = \Lambda$$

Correct me if I am wrong, but from here I believe the normal "nice" change of coordinates ignoring the above theorem is defining:

$$q(x,y) = {\bf{x}}^TA{\bf{x}} = {\bf{x}}^TS\Lambda S^T{\bf{x}} := {\bf{x'}}^T\Lambda{\bf{x'}} = r(x',y')$$

And because $S$ is an orthogonal matrix this means the quadratic form is preserved under the new $(x',y')$ coordinates.
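The orthogonal change of coordinates above can be checked numerically. This is only a minimal sketch in plain Python; the helper names (`transpose`, `matmul`, `new_coords`, `r_form`) are illustrative and not part of the original post.

```python
import math

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

# Matrix of q(x, y) = -x^2 + 6xy - 9y^2 and its orthonormal eigenvectors (columns of S).
A = [[-1.0, 3.0], [3.0, -9.0]]
r = math.sqrt(10.0)
S = [[1 / r, 3 / r], [-3 / r, 1 / r]]

# Lambda = S^T A S; it should be close to diag(-10, 0) up to rounding.
Lam = matmul(transpose(S), matmul(A, S))

def q(x, y):
    return -x * x + 6 * x * y - 9 * y * y

def new_coords(x, y):
    """Return x' = S^T x."""
    St = transpose(S)
    return (St[0][0] * x + St[0][1] * y, St[1][0] * x + St[1][1] * y)

def r_form(xp, yp):
    """Evaluate x'^T Lambda x' in the new coordinates."""
    return xp * (Lam[0][0] * xp + Lam[0][1] * yp) + yp * (Lam[1][0] * xp + Lam[1][1] * yp)
```

For any point, `q(x, y)` and `r_form(*new_coords(x, y))` should agree up to floating-point error, which is what "the form is preserved" means here.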

However if we write for Sylvester's theorem:

$$Q = \begin{pmatrix} \sqrt{10} & 0 \\ 0 & 1 \end{pmatrix}, D = \begin{pmatrix} -1 & 0 \\ 0 & 0 \end{pmatrix}$$

to factorise $\Lambda$ as $\Lambda = QDQ^T \iff A = SQD(SQ)^T$, then defining $P := S{(Q^{-1})}^T = SQ^{-1}$ verifies that there does indeed exist an invertible (but not orthogonal!) $P$ s.t. $P^T A P = D$. (The $(2,2)$ entry of $Q$ multiplies the zero in $D$, so any nonzero value works there; it has to be nonzero for $Q^{-1}$ to exist.) By the theorem there must also be a coordinate change in which $q({\bf{v}}) = -(x'')^2$.

The new coordinates $(x'', y'')$ defined by ${\bf{x}}^TSQD (SQ)^T{\bf{x}} := {\bf{x''}}^TD{\bf{x''}}$ I computed as ${\bf{x''}} = (SQ)^T{\bf{x}} = \left(x-3y, \frac{3x+y}{\sqrt{10}}\right)$, which gives back $q(x,y) = -(x-3y)^2$ on multiplying out the matrix-vector product.
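A numerical check of this construction, as a hedged sketch: it assumes the $(2,2)$ entry of $Q$ is taken to be $1$, so that $Q$ is invertible and $P = SQ^{-1}$ is defined. The helper names are illustrative only.

```python
import math

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[-1.0, 3.0], [3.0, -9.0]]
r = math.sqrt(10.0)
S = [[1 / r, 3 / r], [-3 / r, 1 / r]]   # orthonormal eigenvectors as columns
Q = [[r, 0.0], [0.0, 1.0]]               # assumption: (2,2) entry is 1, so Q is invertible
Q_inv = [[1 / r, 0.0], [0.0, 1.0]]

P = matmul(S, Q_inv)                      # P = S Q^{-1}
D = matmul(transpose(P), matmul(A, P))    # should be close to diag(-1, 0)
PtP = matmul(transpose(P), P)             # differs from I, so P is not orthogonal

def sylvester_coords(x, y):
    """Return x'' = (SQ)^T x, which equals P^{-1} x because S is orthogonal."""
    SQt = transpose(matmul(S, Q))
    return (SQt[0][0] * x + SQt[0][1] * y, SQt[1][0] * x + SQt[1][1] * y)
```

With this choice the first new coordinate is $x - 3y$, and $-(x'')^2$ reproduces $q(x, y)$.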

But the matrix $SQ$ is not orthogonal: $SQ(SQ)^T = SQQ^T S^T = SQ^2 S^{-1} \neq I$, which must mean the transformation deforms whatever geometric shape $q$ represents. Is there some geometric meaning to this, or an application? Sylvester's Law of Inertia, which I am told is also an important result, tells me that the numbers $t$ and $u$ of $+1$ and $-1$ entries are invariant under any transformation of the form described in the theorem above. This implies there are many such transformations; how else would I find them?

Best Answer

"Why would I perform this coordinate change when I can diagonalize instead?"

Sometimes we don't want to actually perform the coordinate change; we just want to know that such a change exists. As @WillJagy points out, Sylvester's law says that not only is there such a coordinate change, but the number of $+1$ entries (sometimes called the "signature") is the same for any such coordinate change.
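To illustrate the invariance on the question's example, here is a small sketch in plain Python. The matrices `P1`, `P2`, `P3` are hypothetical choices made for this illustration (the first comes from completing the square, $q = -(x-3y)^2$), and `inertia` only handles symmetric $2 \times 2$ matrices.

```python
import math

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inertia(M, tol=1e-9):
    """Return (t, u): the numbers of positive and negative eigenvalues
    of a symmetric 2x2 matrix M = [[a, b], [b, c]]."""
    a, b, c = M[0][0], M[0][1], M[1][1]
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    eigs = (mean + radius, mean - radius)
    return (sum(e > tol for e in eigs), sum(e < -tol for e in eigs))

A = [[-1, 3], [3, -9]]

P1 = [[1, 3], [0, 1]]   # x = x'' + 3y'', y = y''  (completing the square)
P2 = [[1, 6], [0, 2]]   # same idea, with the kernel direction rescaled
P3 = [[2, 1], [1, 1]]   # an arbitrary invertible matrix; P3^T A P3 is not diagonal

congruent = [matmul(transpose(P), matmul(A, P)) for P in (P1, P2, P3)]
signs = [inertia(M) for M in congruent]
```

`P1` and `P2` are two different invertible matrices that both bring $A$ to $\operatorname{diag}(-1, 0)$, and even for `P3` the counts $(t, u)$ agree with those of $A$ itself.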

Here's my favorite application:

Take a smooth compact surface without boundary $S$ in 3-space. Consider the function $$ f_v: S \to \Bbb R : (x, y, z) \mapsto (x, y, z) \cdot v $$ This is smooth for every $v$, and for almost every unit vector $v$ it will be a Morse function, i.e. all of its critical points are nondegenerate. To simplify, let's suppose that $v = (0,0,1)$ works, so that $f = f_v$ is given by $f(x, y, z) = z$.

Now the second derivative of $f$ at each point $P = (x, y, z) \in S$ is a symmetric bilinear form on the tangent plane to $S$ at $P$. If we look at each critical point $Q$ of $f$ (for a sphere, that'd be "the south pole" and "the north pole"), this signature tells you whether the surface "bends down" at $Q$ (as at the north pole), "bends up" at $Q$ (as at the south pole), or "bends both ways" (as at the two middle critical points for a bagel that's balanced vertically on a table). It doesn't matter how much the surface bends up or down, hence I don't care about the eigenvalues; all that matters is that in every direction at the south pole, it's bending up. And the signature tells me that.

The cool theorem? The sum (over all critical points) of $(-1)^{\sigma(Q)}$, where $\sigma(Q)$ is the signature, is the same as the Euler characteristic of the surface.
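As a quick sanity check on the two surfaces mentioned above (a sketch only, taking $\sigma(Q)$ to be the number of $+1$ entries at each critical point of the height function): the sphere has a minimum ($\sigma = 2$) and a maximum ($\sigma = 0$), while the upright bagel has a minimum, two saddles ($\sigma = 1$ each) and a maximum.

$$\text{sphere: } (-1)^{2} + (-1)^{0} = 2, \qquad \text{bagel: } (-1)^{2} + (-1)^{1} + (-1)^{1} + (-1)^{0} = 0$$

These match the Euler characteristics of the sphere ($2$) and the torus ($0$).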

This is all laid out in detail (along with the technical hypotheses required for it to work, like "critical points of $f$ must be isolated") in the first chapter of Milnor's Morse Theory.
