Eigenvector Method vs. Correlation-Free Coordinate Transformation Method

correlation, eigenvalues-eigenvectors, intuition

So, in general, the normed-eigenvector method aims to transform the original problem into an uncorrelated state. But according to the following figure, the transformation to the new coordinate system is done using sines and cosines:

[Figure: rotation of the coordinate axes through an angle $\theta$ via sines and cosines, with $\cot(2\theta)=\frac{a-c}{b}$]

Assume the quadratic form $ax^2+bxy+cy^2$ with the corresponding matrix:

$$\begin{bmatrix}
a & b \\
b & c
\end{bmatrix}$$

Question: How are the two methods for finding a correlation-free coordinate system related?


Partial Solution:

In the rotation-of-axes transformation, as in the figure above, we use $\cot(2\theta)=\frac{a-c}{b}$, while in the eigenvector process we take the determinant of the above matrix with $\lambda$ subtracted from the diagonal entries and set it to zero:

$$\det\begin{bmatrix}
a-\lambda & b \\
b & c-\lambda
\end{bmatrix}=0$$
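
For a concrete sanity check that the two recipes produce the same axes, here is a minimal numeric sketch (coefficients chosen arbitrarily; it uses the $b/2$ off-diagonal convention that the answer below justifies):

```python
import numpy as np

# Arbitrary example coefficients for f = a x^2 + b x y + c y^2
a, b, c = 3.0, 2.0, 1.0

# Symmetric matrix of the form, with b/2 off the diagonal
# (this convention is justified in the answer below)
A = np.array([[a, b / 2],
              [b / 2, c]])

# Method 1: rotate the axes by theta with cot(2*theta) = (a - c) / b
theta = 0.5 * np.arctan2(b, a - c)
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
print(np.round(Q.T @ A @ Q, 10))  # diagonal: the cross term is gone

# Method 2: eigenvalues of A, i.e. the roots of det(A - lambda*I) = 0
print(np.linalg.eigvalsh(A))      # the same two numbers, in ascending order
```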

I am looking for an explanation, preferably in steps, of how the two relate to each other, as well as how mere sines and cosines do the same job in fewer steps than the lengthy process of finding normed eigenvectors.

Best Answer

By the Spectral Theorem, any real symmetric matrix can be orthogonally diagonalized; namely, if $A$ is real symmetric, then there exists an orthogonal matrix $Q$ such that $Q^TAQ=D$, where $Q^T=Q^{-1}$ and $D$ is a diagonal matrix.
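
A minimal numerical illustration of this statement, assuming nothing beyond numpy (the random matrix and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Any real symmetric matrix...
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2

# ...is orthogonally diagonalized: eigh returns the diagonal of D
# together with the orthogonal matrix Q
eigvals, Q = np.linalg.eigh(A)

assert np.allclose(Q.T @ Q, np.eye(4))             # Q^T = Q^{-1}
assert np.allclose(Q.T @ A @ Q, np.diag(eigvals))  # Q^T A Q = D is diagonal
```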

In the situation here, one can associate a quadratic form $f:=ax^2+bxy+cy^2$ with a symmetric matrix $A$ in the following way: $$f=X^TAX,$$ where $$A=\left[\begin{array}{cc}a&\frac b2\\ \frac b2&c\end{array}\right]~{\rm and~}X=\left[\begin{array}{c}x\\y\end{array}\right].$$
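
This association can be checked symbolically; the following sympy sketch (symbol names are my own) expands $X^TAX$ and confirms it equals $f$:

```python
import sympy as sp

a, b, c, x, y = sp.symbols('a b c x y')

A = sp.Matrix([[a, b / 2],
               [b / 2, c]])
X = sp.Matrix([x, y])

f = (X.T * A * X)[0]  # X^T A X is a 1x1 matrix; take its single entry
assert sp.simplify(f - (a*x**2 + b*x*y + c*y**2)) == 0
```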

Note that in the Spectral Theorem above, $Q$ consists of a basis of (column) eigenvectors of unit length, and $D$ is a diagonal matrix consisting of the eigenvalues of $A$, ordered compatibly with the eigenvectors. For example, if $A$ has eigenvalues $\lambda_1\geq \lambda_2$, with associated unit eigenvectors $X_1,X_2$, then a common choice for $Q$ is $Q=[X_1,X_2]$ such that $\det Q=1$ (i.e. $X_2$ is obtained from $X_1$ by turning $90^\circ$ counterclockwise). Bearing this in mind, one then has $A=QDQ^T$ and $$f=X^TQDQ^TX=(Q^TX)^TD(Q^TX).$$ Letting $X'=\left[\begin{array}{c}x'\\y'\end{array}\right]=Q^TX$ (and so $X=QX'$) and $D={\rm diag}(\lambda_1,\lambda_2)$, one then has $$f=X'^TDX'=\lambda_1x'^2+\lambda_2y'^2.$$
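
The sketch below (arbitrary coefficients and test point of my choosing) illustrates this change of variables numerically: evaluating $f$ at $X$ agrees with $\lambda_1x'^2+\lambda_2y'^2$ at $X'=Q^TX$. Note that numpy's `eigh` returns eigenvalues in ascending order, opposite to the $\lambda_1\geq\lambda_2$ convention above, but each eigenvalue stays paired with its own column of $Q$, which is all the identity requires:

```python
import numpy as np

# Arbitrary example coefficients and an arbitrary test point
a, b, c = 3.0, 2.0, 1.0
A = np.array([[a, b / 2],
              [b / 2, c]])

lam, Q = np.linalg.eigh(A)  # columns of Q are unit eigenvectors of A

X = np.array([1.7, -0.4])
Xp = Q.T @ X                # rotated coordinates X' = Q^T X

f_original = a * X[0]**2 + b * X[0] * X[1] + c * X[1]**2
f_rotated = lam[0] * Xp[0]**2 + lam[1] * Xp[1]**2
assert np.isclose(f_original, f_rotated)  # same value, no cross term needed
```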

Now to address the question of why one can take $Q$ to be the matrix of the form $$Q=\left[\begin{array}{cc}\cos\theta&-\sin\theta\\ \sin\theta&\cos\theta\end{array}\right]~{\rm with~}\cot(2\theta)=\frac{a-c}b,$$ one just needs to compare the first column vector of the orthogonal matrix $Q$: since $\det Q=1$, $Q$ is determined by its first column vector (the second one is obtained by a rotation as above), which corresponds to one of the eigenvectors of $A$, say with eigenvalue $$\lambda_1=\frac{(a+c)+\sqrt{(a-c)^2+b^2}}2,$$ and the eigenvector needs to satisfy the system $$\left\{\begin{array}{c}(a-\lambda_1)x+\frac b2y=0\\ \frac b2x+(c-\lambda_1)y=0\end{array}\right..$$ Up to scaling, a solution is of the form $$X_1=k\left[\begin{array}{c}\frac b2\\ \lambda_1-a\end{array}\right].$$ Comparing this to $$\left[\begin{array}{c}\cos\theta\\ \sin\theta\end{array}\right],$$ one has $$\tan\theta=\frac{\lambda_1-a}{\frac b2}=\frac{\frac{a+c+\sqrt{(a-c)^2+b^2}}2-a}{\frac b2}=\frac{c-a+\sqrt{(a-c)^2+b^2}}b$$ after some algebra.
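
To confirm this comparison concretely, the following sketch (reusing arbitrary example coefficients) checks both that $(\cos\theta,\sin\theta)$, with $\tan\theta$ as just derived, is a unit eigenvector for $\lambda_1$, and that this $\theta$ indeed satisfies $\cot(2\theta)=\frac{a-c}b$:

```python
import numpy as np

a, b, c = 3.0, 2.0, 1.0
A = np.array([[a, b / 2],
              [b / 2, c]])

lam1 = ((a + c) + np.sqrt((a - c)**2 + b**2)) / 2
theta = np.arctan((c - a + np.sqrt((a - c)**2 + b**2)) / b)

v = np.array([np.cos(theta), np.sin(theta)])
assert np.allclose(A @ v, lam1 * v)                    # eigenvector for lam1
assert np.isclose(1 / np.tan(2 * theta), (a - c) / b)  # cot(2 theta) = (a-c)/b
```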

To finish up proving that $\cot(2\theta)=\frac{a-c}b$, one uses the following trigonometric identities: $$\cot(2\theta)=1/\tan(2\theta),\qquad\tan(2\theta)=\frac{2\tan\theta}{1-\tan^2\theta}.$$ It follows that $$\cot(2\theta)=\frac{1-\tan^2\theta}{2\tan\theta}=\frac{1-\frac{\left(c-a+\sqrt{(a-c)^2+b^2}\right)^2}{b^2}}{2\cdot\frac{c-a+\sqrt{(a-c)^2+b^2}}b}.$$ Writing $s=\sqrt{(a-c)^2+b^2}$ and $t=c-a+s$, one has $t^2=2(c-a)^2+2(c-a)s+b^2$ (since $s^2=(a-c)^2+b^2$), so $b^2-t^2=-2(c-a)t$ and $$\cot(2\theta)=\frac{b^2-t^2}{2bt}=\frac{a-c}b,$$ as required.
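
This last bit of algebra can also be delegated to a computer algebra system; the following sympy sketch (symbols declared positive to keep the radical manipulations unconditional) verifies the identity symbolically:

```python
import sympy as sp

a, b, c = sp.symbols('a b c', positive=True)

t = (c - a + sp.sqrt((a - c)**2 + b**2)) / b  # tan(theta) from above
cot2theta = (1 - t**2) / (2 * t)              # cot(2 theta) = (1 - tan^2)/(2 tan)

assert sp.simplify(cot2theta - (a - c) / b) == 0
```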
