[Math] How to compute solution of a non-square matrix by QR Decomposition and Cholesky Factorization

linear algebralinear-transformationsmatrices

$$
A = \begin{matrix}
10000 & 10001 & \\
10001 & 10002 & \\
10002 & 10003 & \\
10003 & 10004 & \\
10004 & 10005 & \\
\end{matrix}
$$

$$
b = \begin{matrix}
20001 & \\
20003 & \\
20005 & \\
20007 & \\
20009 & \\
\end{matrix}
$$
I want to find QR Decomposition of Ax = b by using Householder Transformation and also compute the solution using the Cholesky factorization. Could you help me ?

Best Answer

For the Householder approach, we will refer to these notes and for Cholesky, these notes.

$$A = \left( \begin{array}{cc} 10000. & 10001. \\ 10001. & 10002. \\ 10002. & 10003. \\ 10003. & 10004. \\ 10004. & 10005. \\ \end{array} \right), ~~b = \begin {pmatrix} 20001 \\20003\\20005\\20007\\20009\end{pmatrix}$$

$A_1 =A$

$x_1=\left( \begin{array}{c} 10000. \\ 10001. \\ 10002. \\ 10003. \\ 10004. \\ \end{array} \right)$

$n_1 = ||x_1||_2 = 22365.2$

$e_1=\left( \begin{array}{c} 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ \end{array} \right)$

$v_1 = x_1 + n_1 e_1 = \left( \begin{array}{c} 32365.2 \\ 10001. \\ 10002. \\ 10003. \\ 10004. \\ \end{array} \right)$

$c_1 = \dfrac{2}{v_1^T.v_1} = 1.381498731530902 \times 10^{-9}$

$h_1 = I - c_1 v_1 . v_1^T = \left( \begin{array}{ccccc} -0.447124 & -0.447169 & -0.447214 & -0.447258 & -0.447303 \\ -0.447169 & 0.861822 & -0.138191 & -0.138205 & -0.138219 \\ -0.447214 & -0.138191 & 0.861795 & -0.138219 & -0.138233 \\ -0.447258 & -0.138205 & -0.138219 & 0.861767 & -0.138247 \\ -0.447303 & -0.138219 & -0.138233 & -0.138247 & 0.86174 \\ \end{array} \right)$

$A_2 = h_1.A_1 = \left( \begin{array}{cc} -22365.2 & -22367.4 \\ 0 & 0.0000382051 \\ 0 & -0.000061781 \\ 0 & -0.000161767 \\ 0 & -0.000261753 \\ \end{array} \right)$

$x_2 = \left( \begin{array}{c} 0.0000382051 \\ -0.000061781 \\ -0.000161767 \\ -0.000261753 \\ \end{array} \right)$

$n_2 = ||x_2||_2 = 0.000316165$

$e_2=\left( \begin{array}{c} 1 \\ 0 \\ 0 \\ 0 \\ \end{array} \right)$

$v_2 = x_2 - n_2 e_2 = \left( \begin{array}{c} -0.000277959 \\ -0.000061781 \\ -0.000161767 \\ -0.000261753 \\ \end{array} \right)$

$c_2 = \dfrac{2}{v_2^T.v_2} = 1.1379 \times 10^7$

$h_2 = I - c_2 v_2 . v_2^T = \left( \begin{array}{cccc} 0.120839 & -0.195408 & -0.511655 & -0.827902 \\ -0.195408 & 0.956567 & -0.113724 & -0.184015 \\ -0.511655 & -0.113724 & 0.702226 & -0.481824 \\ -0.827902 & -0.184015 & -0.481824 & 0.220367 \\ \end{array} \right)$

$A_3 = h_2.\left( \begin{array}{c} 0.0000382051 \\ -0.000061781 \\ -0.000161767 \\ -0.000261753 \\ \end{array} \right) = \left( \begin{array}{c} 0.000316165 \\ 0 \\ 0 \\ 0 \\ \end{array} \right)$

Following the notes, we have $A = QR = h_1.h_2.A_3$, which is

$\left( \begin{array}{ccccc} -0.447124 & 0.632519 & -0.207235 & 0.1811 & 0.569435 \\ -0.447169 & 0.316291 & -0.259445 & -0.455694 & -0.651944 \\ -0.447214 & 0.0000632303 & 0.892524 & -0.0577574 & -0.00803894 \\ -0.447258 & -0.316165 & -0.177773 & 0.758198 & -0.30583 \\ -0.447303 & -0.632392 & -0.248071 & -0.425846 & 0.396378 \\ \end{array} \right).\left( \begin{array}{cc} -22365.2 & -22367.4 \\ 0 & 0.000316165 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ \end{array} \right)$

Using back substitution, $Rx = Q^T b$ gives

$$x = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$

For Cholesky, we have

$$A_C = A^T.A = \left( \begin{array}{cc} 500200030 & 500250040 \\ 500250040 & 500300055 \\ \end{array} \right)$$

$L_{11} = a_{11}^{1/2} = 22365.2, L_{21} = \dfrac{a_{21}}{\sqrt{a_{11}}} = 22367.4 , L_{22} = \sqrt{a_{22}-L_{21}^2}=0.000422864$

Now, we are ready to solve

$$Ax = b \implies LL^T x = b \implies Lc = b, L^Tx = c$$

of course, we arrive at the same result.

Related Solutions

[Math] Modified Cholesky factorization and retrieving the usual LT matrix

Here is an example of how the effect of columnwise rotation occurs. See the protocol using my MatMate-program. The "rot"-command performs a column-wise rotation to triangular shape, where the order of the invovled rows (the rotation-citeria) is given as list, and the triangular form is so just permuted. However, the permutation includes also a re-computation of the diagonal (and the other entries, too)

;===========================================
;MatMate-Listing vom:06.12.2011 19:40:03
;============================================
[1] A =  mk(4,4,6,3,4,8, 3,6,5,1, 4,5,10,7, 8,1,7, 25)
[2] L = cholesky(A)
      A : 
    6.00        3.00        4.00        8.00
    3.00        6.00        5.00        1.00
    4.00        5.00       10.00        7.00
    8.00        1.00        7.00       25.00

      L : 
    2.45        0.00        0.00        0.00
    1.22        2.12        0.00        0.00
    1.63        1.41        2.31        0.00
    3.27       -1.41        1.59        3.13

[3] M = rot(L,"drei",1´2´3´4)

Command [3] does not change anything because the order given is the default order. In

[4] M = rot(L,"drei",2´3´4´1)
      M : 
    1.22        0.62        1.38       -1.48
    2.45        0.00        0.00        0.00
    2.04        2.42       -0.00        0.00
    0.41        2.55        4.28       -0.00

[5] M = rot(L,"drei",3´4´1´2)
      M : 
    1.26        1.16        1.75        0.00
    1.58       -0.56        0.94        1.52
    3.16       -0.00       -0.00       -0.00
    2.21        4.48        0.00       -0.00

in command [6] we find the solution with the number 5 in the first column:

[6] M = rot(L,"drei",4´1´2´3)
      M : 
    1.60        1.85       -0.00       -0.00
    0.20        1.44        1.97        0.00
    1.40        0.95        1.70       -2.06
    5.00        0.00       -0.00       -0.00

After that I found the "modified cholesky" solution by the rotation in the following order:

[7] M = rot(L,"drei",4´3´2´1)
      M : 
    1.60        0.62        0.92        1.48
    0.20        1.66        1.79        0.00
    1.40        2.84        0.00       -0.00
    5.00        0.00        0.00        0.00

only that the rows are permuted.

So we can even systematically recover the original vanilla-cholesky matrix from that of the modified-cholesky-procedure: if we have r rows, then we need at most r(r-1)/2 rotations to triangular shape to recover the original vanilla-cholesky matrix.

[update 2] A remark about "rotation of columns". Rotation of columns can be thought as postmultiplication of the cholesky L by some rotation-matrix T. T itself is generated by successively rotating pairs of columns, so T can be seen as product of elementary rotation-matrices $\small t_{k,j}=\begin{array} {rrrr} \cos(\varphi_{k,j}) & \sin(\varphi_{k,j}) \\ -\sin(\varphi_{k,j}) & \cos(\varphi_{k,j}) \end{array} $ where that k,j'th rotation-parameters are inserted in the ID-matrix at the appropriate pair of rows k and j and the same columns. Then $\small T = t_{1,2} \cdot t_{1,3} \cdot t_{1,4} \cdot \ldots t_{2,3} \cdot t_{2,4} \cdot \ldots t_{3,4} \cdot \ldots \cdot t_{n-1,n} $ where n is the number of columns of the matrix which is to be rotated. In particular, T is not a permuationmatrix!
The different rotations mentioned in the example above occur, because for the triangular rotation we need to find a rotationmatrix $\small T_1 $ which rotates the matrix L such that the entries in row 1 are collected in column 1, call this version of L $\small L_1$. Next we need to find the rotationmatrix $\small T_2 $ which rotates then $\small L_1$ such that the entries in row 2 are collected in column 2 but where we do not touch the column 1. Save this in matrix $\small L_2$ and so on until $\small L_n = T_1 \cdot T_2 \cdot T_3 \cdot \ldots \cdot T_n $ is a triangular matrix. The different version in the example above is then simply, that the order of the $\small T_k $ in that product is changed (according to the list, given in the rot(...)-command in MatMate. (you may try this using MatMate yourself)

[update] The "modified cholesky" seems to proceed by extracting the row/column (thr "factor") which has the highest entry on the diagonal ("variance") first . Then proceeds with the reduced matrix after extraction of the next row/column. This explains the final permutation order between the rows 1 and 2, which have initially the same diagonal value 6, and the factor in row 1 has even a greater "individual variance" than the factor in row 2. See the following protocol, where I used a copy "chk" of the original matrix A and proceeded to extract always that factor with the highest variance in the (residual) covariance-matrix (MatMate-command "RemVar" ("RemoveVariance of one factor"). Negative signs at the zero is spurious numerical machine-epsilon:

[24] chk = A
      chk : 
    6.00        3.00        4.00        8.00
    3.00        6.00        5.00        1.00
    4.00        5.00       10.00        7.00
    8.00        1.00        7.00       25.00

[25] chk = remvar(chk,4)
      chk : 
    3.44        2.68        1.76        0.00
    2.68        5.96        4.72        0.00
    1.76        4.72        8.04        0.00
    0.00        0.00        0.00        0.00

[26] chk = remvar(chk,3)
      chk : 
    3.05        1.65       -0.00        0.00
    1.65        3.19       -0.00        0.00
   -0.00       -0.00       -0.00        0.00
    0.00        0.00        0.00        0.00

[27] chk = remvar(chk,2)
      chk : 
    2.20        0.00        0.00        0.00
    0.00        0.00        0.00        0.00
    0.00        0.00       -0.00        0.00
    0.00        0.00        0.00        0.00

[Math] What are the properties of matrix inversion using the Cholesky decomposition for non symmetric matrices

Inverting a nonsymmetric matrix $A$ by $A^{-1}=A^T(AA^T)^{-1}$ using the Cholesky factorization of $AA^T$ is a pretty bad idea both in terms of complexity and numerical stability.

There is absolutely nothing saved in terms of the number of operations. Forming the product $AA^T$ prior to its factorization involves about $n^3$ flops (yes, yes, Strassen gives a better exponent). With the cost of $n^3/3$ flops of the Cholesky factorization this gives $4n^3/3$ flops in total. This is twice more expensive than LU and just same as the Householder QR factorization for $A$ directly.

Not only this approach of inversion is not faster in any way, it is also inferior to almost any other method one might think of. The numerical errors committed by solving systems like $AA^T$ are proportional to the square of the condition number of $A$ which in turn roughly halves the number of valid digits in the solution compared to more stable methods based on LU or QR.

Best Answer

Related Solutions

[Math] Modified Cholesky factorization and retrieving the usual LT matrix

[Math] What are the properties of matrix inversion using the Cholesky decomposition for non symmetric matrices

Related Question