[Math] geometric view of similar vs congruent matrices

geometry, intuition, linear algebra, linear-transformations

I'm trying to understand similarity $(A \sim B \iff A = SBS^{-1})$ and congruence $(A \cong B \iff A = PBP^T,\, P\in GL(n, \mathbb{R}))$ through geometric analogies.

For triangles:

  • affine transformations (i.e., uniform scaling, rotation, reflection, translation[?]) preserve similarity. Two similar triangles have equivalent angles but may have different side lengths (i.e., they are measured in a different basis)
  • unitary transformations (i.e., rotation, reflection, translation[?]) preserve congruence

For matrices:

  • similarity preserves the idea of a linear transformation; most "geometrically", eigenvalues are preserved, but angles between vectors are not preserved
  • congruence DOESN'T preserve the spectrum, but it preserves angles between vectors (my interpretation of the bilinear form property) and the number of positive / negative / zero eigenvalues

However, we only require $P$ to be invertible for matrix congruence $A = PBP^T$. Why don't we require $P$ to be unitary (i.e., why do we allow the spectrum to change)? Why is the adjoint $P^T$ important?

Also, is there an analogy for thinking about (1) how angles between vectors are preserved by congruence but not similarity with matrices and (2) how angles between triangle legs are preserved with both similarity and congruence, or am I reading too much into the naming conventions?

Best Answer

I think you are mixing several things together. When we speak about congruence, the matrices do not act on vectors and, consequently, the angles cannot be preserved. The "similar" in "similar triangles" has nothing to do with the "similar" of "similar matrices". When you write "affine transformations (i.e., uniform scaling, rotation, reflection, translation[?])", you are far from the definition of an affine function.

  1. An affine function is in the form $f:x\in\mathbb{R}^n\rightarrow Ax+b\in\mathbb{R}^p$ where $A\in M_{p,n},b\in\mathbb{R}^p$ are fixed.

  2. Similarity of triangles in the plane. Up to a translation, it is given by a transformation of the form $z\in\mathbb{C}\rightarrow az$ (direct similarity) or $z\in\mathbb{C}\rightarrow a\overline{z}$ (inverse similarity), where $a=u+iv$ is a fixed nonzero complex number. It is a composition of a homothety, a rotation and, possibly, a reflection. In the direct case, the associated linear map has the matrix $\begin{pmatrix}u&-v\\v&u\end{pmatrix}$ (these matrices form a vector subspace of $M_2$ that is isomorphic to $\mathbb{C}$).

  3. When we speak about similarity of matrices, the matrix $A$ is considered as a linear function and acts on vectors: $y=Ax$. Under a change of basis $y=Py',x=Px'$, we get $y'=P^{-1}APx'$, that is, the new matrix is $P^{-1}AP$.
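To make the change of basis in 3. concrete, here is a minimal sketch in plain Python. The matrices `A` and `P` and the helper functions are my own illustrative choices, not part of the question. It checks that $y'=P^{-1}APx'$ agrees with $y=Ax$ and that the trace and determinant (hence the characteristic polynomial of a $2\times 2$ matrix) do not change.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse_2x2(M):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def trace(M):
    return M[0][0] + M[1][1]

def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[2, 1], [0, 3]]   # the linear map in the old basis (illustrative)
P = [[1, 1], [1, 2]]   # columns are the new basis vectors (illustrative)
P_inv = inverse_2x2(P)

B = matmul(matmul(P_inv, A), P)   # matrix of the same map in the new basis

x_new = [[3], [-1]]               # coordinates x' in the new basis
x_old = matmul(P, x_new)          # x = P x'
y_old = matmul(A, x_old)          # y = A x
y_new = matmul(P_inv, y_old)      # y' = P^{-1} y

assert matmul(B, x_new) == y_new  # y' = (P^{-1} A P) x'
assert trace(A) == trace(B) and det(A) == det(B)
```

Since a $2\times 2$ characteristic polynomial is determined by the trace and the determinant, the last assertion says that `A` and `B` have the same eigenvalues.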

When we speak about congruence of matrices, the matrix $A$ is considered as a bilinear form and acts on a couple of vectors $(x,y)$: $\phi(x,y)=x^TAy$. By a change of basis $y=Py',x=Px'$ and $x^TAy=x'^TP^TAPy'$, that is, the new matrix is $P^TAP$.
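Here is a small sketch of the congruence case, again in plain Python with matrices I picked for illustration. It checks that the value of the form is unchanged when coordinates are changed, while the trace and determinant (so the eigenvalues) do change. Only the sign of the determinant survives here, because $\det(P^TAP)=\det(P)^2\det(A)$.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(M):
    return [list(row) for row in zip(*M)]

def phi(M, x, y):
    """Value of the bilinear form x^T M y for column vectors x, y."""
    return matmul(matmul(transpose(x), M), y)[0][0]

def trace(M):
    return M[0][0] + M[1][1]

def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[1, 0], [0, -1]]   # symmetric form with one + and one - (illustrative)
P = [[2, 1], [0, 1]]    # invertible but NOT orthogonal (illustrative)

B = matmul(matmul(transpose(P), A), P)   # matrix of the same form, new basis

x_new, y_new = [[1], [2]], [[3], [-1]]             # coordinates x', y'
x_old, y_old = matmul(P, x_new), matmul(P, y_new)  # x = P x', y = P y'

# Same form, same value, whichever coordinates are used.
assert phi(A, x_old, y_old) == phi(B, x_new, y_new)

# The spectrum is not preserved: trace and determinant both change ...
assert trace(A) != trace(B) and det(A) != det(B)
# ... but the determinant keeps its sign, so B still has one positive
# and one negative eigenvalue.
assert det(A) < 0 and det(B) < 0
```

This is why only invertibility of $P$ is asked for: the object being preserved is the form $\phi$, not a spectrum.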

Note that, with the standard inner product $\langle\cdot,\cdot\rangle$, $\phi(x,y)=\langle Ay,x\rangle=\langle y,A^Tx\rangle$, which is the defining property of the adjoint $A^T$.

  4. The orthogonal matrices $U$ bridge the two previous notions because $U^T=U^{-1}$: under an orthonormal change of basis, $U^TAU=U^{-1}AU$, so $A$ transforms in the same way whether it is considered as the matrix of a bilinear form or as the matrix of a linear function.
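A quick sketch of this bridging property, using an exact rational rotation so that no rounding is involved. The matrices `U` and `A` are my own illustrative choices. With $U$ orthogonal, the congruence $U^TAU$ and the similarity $U^{-1}AU$ give the same matrix, so the spectrum is preserved as well.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(M):
    return [list(row) for row in zip(*M)]

def inverse_2x2(M):
    """Exact inverse of an invertible 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def trace(M):
    return M[0][0] + M[1][1]

def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Rotation built from the 3-4-5 triangle: cos = 3/5, sin = 4/5 (illustrative).
U = [[F(3, 5), F(-4, 5)], [F(4, 5), F(3, 5)]]
A = [[2, 1], [1, 3]]   # a symmetric matrix (illustrative)

assert transpose(U) == inverse_2x2(U)   # U^T = U^{-1}

congruent = matmul(matmul(transpose(U), A), U)    # A viewed as a bilinear form
similar = matmul(matmul(inverse_2x2(U), A), U)    # A viewed as a linear map

assert congruent == similar
assert trace(congruent) == trace(A) and det(congruent) == det(A)
```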

EDIT. Answer to jjjjjj. Let $T_1,T_2$ be two triangles. According to the standard definitions,

$T_1,T_2$ are similar iff $f(T_1)=T_2$ where $f$ is affine (cf. 1.) with $A=\lambda U$ where $\lambda>0$ and $U$ is orthogonal.

$T_1,T_2$ are congruent iff $f(T_1)=T_2$ where $f$ is affine (cf. 1.) with $A=U$ orthogonal.

Note that if $\det(U)=1$, then oriented angles are preserved, and if $\det(U)=-1$, then each oriented angle is sent to its opposite.

Assume that $T_1,T_2$ are congruent and placed on a sheet. If $\det(U)=1$, then we can drag $T_1$ onto $T_2$. If $\det(U)=-1$, then we first flip $T_1$ and then drag it onto $T_2$.
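The triangle definitions can be checked on a small example. The sketch below is plain Python, and the triangle, the translation `b` and the matrices are illustrative choices of mine. It applies $f(x)=Ax+b$ with $A=U$ or $A=\lambda U$ and compares squared side lengths and signed areas. The sign of the signed area plays the role of the orientation: it is kept when $\det(U)=1$ and reversed when $\det(U)=-1$.

```python
def apply_affine(A, b, p):
    """Return A p + b for a 2x2 matrix A and points p, b in the plane."""
    return (A[0][0] * p[0] + A[0][1] * p[1] + b[0],
            A[1][0] * p[0] + A[1][1] * p[1] + b[1])

def signed_area(t):
    """Signed area of triangle t; positive if the vertices run counterclockwise."""
    (x1, y1), (x2, y2), (x3, y3) = t
    return ((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2

def squared_sides(t):
    """Squared side lengths, taken in vertex order."""
    return [(t[i][0] - t[(i + 1) % 3][0]) ** 2 + (t[i][1] - t[(i + 1) % 3][1]) ** 2
            for i in range(3)]

T1 = [(0, 0), (4, 0), (0, 3)]    # a 3-4-5 right triangle (illustrative)
b = (1, 2)                       # translation part (illustrative)

rotation = [[0, -1], [1, 0]]     # orthogonal, det = +1
reflection = [[1, 0], [0, -1]]   # orthogonal, det = -1
lam = 2
scaled_rotation = [[lam * e for e in row] for row in rotation]  # A = lambda U

dragged = [apply_affine(rotation, b, p) for p in T1]         # congruent, det(U) = 1
flipped = [apply_affine(reflection, b, p) for p in T1]       # congruent, det(U) = -1
enlarged = [apply_affine(scaled_rotation, b, p) for p in T1] # similar, lambda = 2

# Congruent: same side lengths; orientation kept or reversed.
assert squared_sides(dragged) == squared_sides(T1)
assert squared_sides(flipped) == squared_sides(T1)
assert signed_area(dragged) == signed_area(T1)
assert signed_area(flipped) == -signed_area(T1)

# Similar: every side scaled by lambda, so the area is scaled by lambda^2.
assert squared_sides(enlarged) == [lam ** 2 * s for s in squared_sides(T1)]
assert signed_area(enlarged) == lam ** 2 * signed_area(T1)
```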