[Math] Symmetric matrix decomposition with orthonormal basis of non-eigenvectors

eigenvalues-eigenvectors, linear algebra, matrices, statistics

I would like to understand the following transformation, found in documentation deriving a Kalman filter.

Abstract formulation: Given two symmetric matrices $A, B \in \mathbb{R}^{3 \times 3}$ with $A \ne B$, and a set of orthonormal eigenvectors ($u_1$, $u_2$, $u_3$) of $B$ (not of $A$!). Because the matrices are symmetric, it is clear that $B$ can be decomposed as $B = U\Lambda U^t$.
Now it is stated that $A$ can be written as:

$$A = (u_1^t A u_1)\,u_1 u_1^t + (u_2^t A u_2)\,u_2 u_2^t + (u_3^t A u_3)\,u_3 u_3^t$$
i.e. with the "foreign" eigenvectors.
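To see concretely that this diagonal-only formula is not an identity for an arbitrary symmetric $A$, here is a small numerical sketch (numpy, with random matrices standing in for $A$ and $B$; nothing here comes from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random symmetric matrices standing in for A and B (not from the paper).
A = rng.standard_normal((3, 3)); A = A + A.T
B = rng.standard_normal((3, 3)); B = B + B.T

# Orthonormal eigenvectors of B (columns of U).
_, U = np.linalg.eigh(B)

# Diagonal-only reconstruction of A with B's eigenvectors, as in the quoted formula.
A_diag = sum((U[:, i] @ A @ U[:, i]) * np.outer(U[:, i], U[:, i]) for i in range(3))

print(np.allclose(A_diag, A))   # generally False: the off-diagonal terms are missing
```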

Concrete situation: In the original equation, the $A$ mentioned above is defined as $H_kP_k^-H_k^t + R_a$, where $P_k^-$ is the a priori estimation error covariance and $R_a$ is the sensor noise covariance matrix. $H_k$ has dimension $3\times 9$ and contains some "more abstract" content: the rotation matrix of a quaternion multiplied with the cross-product operator of the gravity vector $(0,0,g)$. As far as I can see, the term $H_kP_k^-H_k^t + R_a$ does not lead to a diagonal matrix, and this seems to be irrelevant. What I called $B$ is actually the signal's overall error covariance, named $U_k$.
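A quick dimension check with placeholder matrices of the stated shapes (the actual $H_k$, $P_k^-$ and $R_a$ from the paper are not reproduced here) confirms that $H_kP_k^-H_k^t + R_a$ is a symmetric $3\times 3$ matrix, but in general not a diagonal one:

```python
import numpy as np

rng = np.random.default_rng(1)

H = rng.standard_normal((3, 9))             # placeholder for the 3x9 measurement matrix H_k
L = rng.standard_normal((9, 9))
P = L @ L.T                                 # placeholder a priori covariance P_k^- (symmetric PSD)
R = np.diag(rng.uniform(0.01, 0.1, 3))      # placeholder sensor noise covariance R_a

A = H @ P @ H.T + R                         # the 3x3 term discussed above

print(A.shape, np.allclose(A, A.T))         # (3, 3) True  -> symmetric
print(np.allclose(A, np.diag(np.diag(A))))  # generally False -> not diagonal
```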

From the original paper: [image of the equation, not reproduced here]

Because the covariance matrix $U_k$ cannot be obtained at this point in time (a priori estimation), it is approximated by the average of the last $M$ steps, i.e. from $k-M$ to $k-1$. The signal itself may fluctuate considerably, because sometimes there is external acceleration and at other times there is not, so that sensor noise is the only thing being measured.
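A minimal sketch of this sliding-window reading, assuming $U_k$ is approximated by the average of outer products of the last $M$ measurement residuals (the exact averaging formula in the paper may differ):

```python
import numpy as np

def window_covariance(residuals, k, M):
    """Approximate U_k by averaging outer products of the last M residuals
    (indices k-M .. k-1).  residuals: array of shape (N, 3)."""
    window = residuals[k - M:k]
    return sum(np.outer(r, r) for r in window) / M

# Example with synthetic residuals (not data from the paper).
rng = np.random.default_rng(2)
residuals = rng.standard_normal((100, 3)) * 0.05   # noise-only phase
U_k = window_covariance(residuals, k=50, M=10)
print(U_k.shape)                                   # (3, 3)
```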

Assumption (thanks to Calle's and joriki's comments): The eigendecomposition of $U_k$ is related to PCA (principal component analysis; an easier case here). The most interesting cases are the measurements with strong accelerations, i.e. where $U_k$ is much greater than the remaining term. So this decomposition of the second term transforms it (approximately?) towards the direction of the strongest signal. Thus $\lambda - \mu$ helps to detect these situations, i.e. to distinguish them from phases with no signal apart from noise.
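Under that reading, a detection step might compare the largest eigenvalue of the windowed covariance against the level expected from sensor noise alone. The sketch below is only my assumption about how $\lambda - \mu$ could be used; it is not the paper's actual rule, and the threshold is made up:

```python
import numpy as np

def external_acceleration_detected(U_hat, R_a, threshold):
    """Flag a window whose strongest principal direction carries much more
    variance than sensor noise alone would explain (hypothetical rule)."""
    lam = np.linalg.eigvalsh(U_hat)[-1]   # largest eigenvalue of the windowed covariance
    mu = np.linalg.eigvalsh(R_a)[-1]      # largest eigenvalue of the noise covariance
    return lam - mu > threshold

# Toy usage with placeholder matrices.
R_a = 0.05 * np.eye(3)
U_quiet = 0.05 * np.eye(3)                           # noise-only window
U_accel = U_quiet + np.outer([0, 0, 1], [0, 0, 1])   # strong signal along one axis
print(external_acceleration_detected(U_quiet, R_a, 0.1))   # False
print(external_acceleration_detected(U_accel, R_a, 0.1))   # True
```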

  • Does this explanation make sense?
  • Can this procedure of approximating with the "wrong" eigenvectors and eigenvalues be applied, and does it compare like with like?
  • What is the name of this matrix decomposition that uses eigenvectors other than the matrix's own?
  • What about the approximation error?

Thanks for helping

Kay

PS: Title changed from "Symmetric matrix multiplied with kind of orthonormal basis"

Best Answer

Writing a matrix $A$ in terms of a basis that does not diagonalize $A$ is possible, but it requires a full expansion of all terms, not just the diagonal ones. (If the basis diagonalized $A$, all off-diagonal terms would be zero.)

If $A$ is a $3 \times 3$ matrix and you write $ A = U U^{-1} A UU^{-1}$ with $$U = \begin{bmatrix}\mathbf{u}_1 & \mathbf{u}_2 & \mathbf{u}_3\end{bmatrix}$$ then you get a sum over all pairs $ij$ (not just $i=j$): $$ A = \sum_{ij}\mathbf{u}_i ( \mathbf{u}_i^\top A \mathbf{u}_j)\mathbf{u}_j^\top$$

I cannot tell you why the off-diagonal terms were ignored; perhaps it is by design, or perhaps by mistake.

Here is the full expansion:

\begin{align} U^{-1}AU &=\begin{bmatrix}\mathbf{u}_1^\top \\ \mathbf{u}_2^\top \\ \mathbf{u}_3^\top \end{bmatrix} A\begin{bmatrix}\mathbf{u}_1 & \mathbf{u}_2 & \mathbf{u}_3\end{bmatrix} \\ &=\begin{bmatrix}\mathbf{u}_1^\top A \\ \mathbf{u}_2^\top A \\ \mathbf{u}_3^\top A\end{bmatrix} \begin{bmatrix}\mathbf{u}_1 & \mathbf{u}_2 & \mathbf{u}_3\end{bmatrix} \\ &= \begin{bmatrix}\mathbf{u}_1^\top A \mathbf{u}_1 & \mathbf{u}_1^\top A \mathbf{u}_2 & \mathbf{u}_1^\top A \mathbf{u}_3 \\ \mathbf{u}_2^\top A \mathbf{u}_1 & \mathbf{u}_2^\top A \mathbf{u}_2 & \mathbf{u}_2^\top A \mathbf{u}_3 \\\mathbf{u}_3^\top A \mathbf{u}_1 & \mathbf{u}_3^\top A \mathbf{u}_2 & \mathbf{u}_3^\top A \mathbf{u}_3 \\\end{bmatrix} \\ UU^{-1}AUU^{-1} &= \begin{bmatrix}\mathbf{u}_1 & \mathbf{u}_2 & \mathbf{u}_3\end{bmatrix}\begin{bmatrix}\mathbf{u}_1^\top A \mathbf{u}_1 & \mathbf{u}_1^\top A \mathbf{u}_2 & \mathbf{u}_1^\top A \mathbf{u}_3 \\ \mathbf{u}_2^\top A \mathbf{u}_1 & \mathbf{u}_2^\top A \mathbf{u}_2 & \mathbf{u}_2^\top A \mathbf{u}_3 \\\mathbf{u}_3^\top A \mathbf{u}_1 & \mathbf{u}_3^\top A \mathbf{u}_2 & \mathbf{u}_3^\top A \mathbf{u}_3 \\\end{bmatrix}\begin{bmatrix}\mathbf{u}_1^\top \\ \mathbf{u}_2^\top \\ \mathbf{u}_3^\top\end{bmatrix} \\ \hphantom{A} \\ A &= \begin{bmatrix}\mathbf{u}_1 & \mathbf{u}_2 & \mathbf{u}_3\end{bmatrix}\begin{bmatrix}\mathbf{u}_1^\top A \mathbf{u}_1 \mathbf{u}_1^\top + \mathbf{u}_1^\top A \mathbf{u}_2 \mathbf{u}_2^\top + \mathbf{u}_1^\top A \mathbf{u}_3\mathbf{u}_3^\top \\ \mathbf{u}_2^\top A \mathbf{u}_1\mathbf{u}_1^\top + \mathbf{u}_2^\top A \mathbf{u}_2\mathbf{u}_2^\top + \mathbf{u}_2^\top A \mathbf{u}_3\mathbf{u}_3^\top \\\mathbf{u}_3^\top A \mathbf{u}_1\mathbf{u}_1^\top + \mathbf{u}_3^\top A \mathbf{u}_2\mathbf{u}_2^\top + \mathbf{u}_3^\top A \mathbf{u}_3\mathbf{u}_3^\top \\\end{bmatrix} \\ &= \sum_{ij}\mathbf{u}_i ( \mathbf{u}_i^\top A \mathbf{u}_j)\mathbf{u}_j^\top \end{align}
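A quick numerical check of this full expansion (numpy, random matrices): the double sum over all pairs $ij$, using the orthonormal eigenvectors of $B$, recovers $A$ exactly:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3)); A = A + A.T   # arbitrary symmetric A
B = rng.standard_normal((3, 3)); B = B + B.T
_, U = np.linalg.eigh(B)                        # orthonormal basis u_1, u_2, u_3 (columns)

# Full double sum over all pairs (i, j), as derived above.
full = sum((U[:, i] @ A @ U[:, j]) * np.outer(U[:, i], U[:, j])
           for i in range(3) for j in range(3))

print(np.allclose(full, A))   # True: the expansion over all ij pairs recovers A
```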