Here is an example of how the effect of columnwise rotation occurs. See the protocol using my MatMate-program. The "rot"-command performs a column-wise rotation to triangular shape, where the order of the invovled rows (the rotation-citeria) is given as list, and the triangular form is so just permuted. However, the permutation includes also a re-computation of the diagonal (and the other entries, too)
;===========================================
;MatMate-Listing vom:06.12.2011 19:40:03
;============================================
[1] A = mk(4,4,6,3,4,8, 3,6,5,1, 4,5,10,7, 8,1,7, 25)
[2] L = cholesky(A)
A :
6.00 3.00 4.00 8.00
3.00 6.00 5.00 1.00
4.00 5.00 10.00 7.00
8.00 1.00 7.00 25.00
L :
2.45 0.00 0.00 0.00
1.22 2.12 0.00 0.00
1.63 1.41 2.31 0.00
3.27 -1.41 1.59 3.13
[3] M = rot(L,"drei",1´2´3´4)
Command [3] does not change anything because the order given is the default order. In
[4] M = rot(L,"drei",2´3´4´1)
M :
1.22 0.62 1.38 -1.48
2.45 0.00 0.00 0.00
2.04 2.42 -0.00 0.00
0.41 2.55 4.28 -0.00
[5] M = rot(L,"drei",3´4´1´2)
M :
1.26 1.16 1.75 0.00
1.58 -0.56 0.94 1.52
3.16 -0.00 -0.00 -0.00
2.21 4.48 0.00 -0.00
in command [6] we find the solution with the number 5 in the first column:
[6] M = rot(L,"drei",4´1´2´3)
M :
1.60 1.85 -0.00 -0.00
0.20 1.44 1.97 0.00
1.40 0.95 1.70 -2.06
5.00 0.00 -0.00 -0.00
After that I found the "modified cholesky" solution by the rotation in the following order:
[7] M = rot(L,"drei",4´3´2´1)
M :
1.60 0.62 0.92 1.48
0.20 1.66 1.79 0.00
1.40 2.84 0.00 -0.00
5.00 0.00 0.00 0.00
only that the rows are permuted.
So we can even systematically recover the original vanilla-cholesky matrix from that of the modified-cholesky-procedure: if we have r rows, then we need at most r(r-1)/2 rotations to triangular shape to recover the original vanilla-cholesky matrix.
[update 2] A remark about "rotation of columns".
Rotation of columns can be thought as postmultiplication of the cholesky
L by some rotation-matrix
T.
T itself is generated by successively rotating pairs of columns, so
T can be seen as product of elementary rotation-matrices $\small t_{k,j}=\begin{array} {rrrr} \cos(\varphi_{k,j}) & \sin(\varphi_{k,j}) \\ -\sin(\varphi_{k,j}) & \cos(\varphi_{k,j}) \end{array} $ where that
k,j'th rotation-parameters are inserted in the
ID-matrix at the appropriate pair of rows
k and
j and the same columns. Then $\small T = t_{1,2} \cdot t_{1,3} \cdot t_{1,4} \cdot \ldots t_{2,3} \cdot t_{2,4} \cdot \ldots t_{3,4} \cdot \ldots \cdot t_{n-1,n} $ where
n is the number of columns of the matrix which is to be rotated. In particular,
T is
not a permuationmatrix!
The different rotations mentioned in the example above occur, because for the triangular rotation we need to find a rotationmatrix $\small T_1 $ which rotates the matrix
L such that the entries in row
1 are collected in column
1, call this version of
L $\small L_1$. Next we need to find the rotationmatrix $\small T_2 $ which rotates then $\small L_1$ such that the entries in row
2 are collected in column
2 but where we do not touch the column
1. Save this in matrix $\small L_2$ and so on until $\small L_n = T_1 \cdot T_2 \cdot T_3 \cdot \ldots \cdot T_n $ is a triangular matrix. The different version in the example above is then simply, that the order of the $\small T_k $ in that product is changed (according to the list, given in the
rot(...)-command in MatMate. (you may try this using MatMate yourself)
[update]
The "modified cholesky" seems to proceed by extracting the row/column (thr "factor") which has the highest entry on the diagonal ("variance") first . Then proceeds with the reduced matrix after extraction of the next row/column. This explains the final permutation order between the rows 1 and 2, which have initially the same diagonal value 6, and the factor in row 1 has even a greater "individual variance" than the factor in row 2. See the following protocol, where I used a copy "chk" of the original matrix
A and proceeded to extract always that factor with the highest variance in the (residual) covariance-matrix (MatMate-command "RemVar" ("RemoveVariance of one factor"). Negative signs at the zero is spurious numerical machine-epsilon:
[24] chk = A
chk :
6.00 3.00 4.00 8.00
3.00 6.00 5.00 1.00
4.00 5.00 10.00 7.00
8.00 1.00 7.00 25.00
[25] chk = remvar(chk,4)
chk :
3.44 2.68 1.76 0.00
2.68 5.96 4.72 0.00
1.76 4.72 8.04 0.00
0.00 0.00 0.00 0.00
[26] chk = remvar(chk,3)
chk :
3.05 1.65 -0.00 0.00
1.65 3.19 -0.00 0.00
-0.00 -0.00 -0.00 0.00
0.00 0.00 0.00 0.00
[27] chk = remvar(chk,2)
chk :
2.20 0.00 0.00 0.00
0.00 0.00 0.00 0.00
0.00 0.00 -0.00 0.00
0.00 0.00 0.00 0.00
Best Answer
For the Householder approach, we will refer to these notes and for Cholesky, these notes.
$$A = \left( \begin{array}{cc} 10000. & 10001. \\ 10001. & 10002. \\ 10002. & 10003. \\ 10003. & 10004. \\ 10004. & 10005. \\ \end{array} \right), ~~b = \begin {pmatrix} 20001 \\20003\\20005\\20007\\20009\end{pmatrix}$$
$A_1 =A$
$x_1=\left( \begin{array}{c} 10000. \\ 10001. \\ 10002. \\ 10003. \\ 10004. \\ \end{array} \right)$
$n_1 = ||x_1||_2 = 22365.2$
$e_1=\left( \begin{array}{c} 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ \end{array} \right)$
$v_1 = x_1 + n_1 e_1 = \left( \begin{array}{c} 32365.2 \\ 10001. \\ 10002. \\ 10003. \\ 10004. \\ \end{array} \right)$
$c_1 = \dfrac{2}{v_1^T.v_1} = 1.381498731530902 \times 10^{-9}$
$h_1 = I - c_1 v_1 . v_1^T = \left( \begin{array}{ccccc} -0.447124 & -0.447169 & -0.447214 & -0.447258 & -0.447303 \\ -0.447169 & 0.861822 & -0.138191 & -0.138205 & -0.138219 \\ -0.447214 & -0.138191 & 0.861795 & -0.138219 & -0.138233 \\ -0.447258 & -0.138205 & -0.138219 & 0.861767 & -0.138247 \\ -0.447303 & -0.138219 & -0.138233 & -0.138247 & 0.86174 \\ \end{array} \right)$
$A_2 = h_1.A_1 = \left( \begin{array}{cc} -22365.2 & -22367.4 \\ 0 & 0.0000382051 \\ 0 & -0.000061781 \\ 0 & -0.000161767 \\ 0 & -0.000261753 \\ \end{array} \right)$
$x_2 = \left( \begin{array}{c} 0.0000382051 \\ -0.000061781 \\ -0.000161767 \\ -0.000261753 \\ \end{array} \right)$
$n_2 = ||x_2||_2 = 0.000316165$
$e_2=\left( \begin{array}{c} 1 \\ 0 \\ 0 \\ 0 \\ \end{array} \right)$
$v_2 = x_2 - n_2 e_2 = \left( \begin{array}{c} -0.000277959 \\ -0.000061781 \\ -0.000161767 \\ -0.000261753 \\ \end{array} \right)$
$c_2 = \dfrac{2}{v_2^T.v_2} = 1.1379 \times 10^7$
$h_2 = I - c_2 v_2 . v_2^T = \left( \begin{array}{cccc} 0.120839 & -0.195408 & -0.511655 & -0.827902 \\ -0.195408 & 0.956567 & -0.113724 & -0.184015 \\ -0.511655 & -0.113724 & 0.702226 & -0.481824 \\ -0.827902 & -0.184015 & -0.481824 & 0.220367 \\ \end{array} \right)$
$A_3 = h_2.\left( \begin{array}{c} 0.0000382051 \\ -0.000061781 \\ -0.000161767 \\ -0.000261753 \\ \end{array} \right) = \left( \begin{array}{c} 0.000316165 \\ 0 \\ 0 \\ 0 \\ \end{array} \right)$
Following the notes, we have $A = QR = h_1.h_2.A_3$, which is
$\left( \begin{array}{ccccc} -0.447124 & 0.632519 & -0.207235 & 0.1811 & 0.569435 \\ -0.447169 & 0.316291 & -0.259445 & -0.455694 & -0.651944 \\ -0.447214 & 0.0000632303 & 0.892524 & -0.0577574 & -0.00803894 \\ -0.447258 & -0.316165 & -0.177773 & 0.758198 & -0.30583 \\ -0.447303 & -0.632392 & -0.248071 & -0.425846 & 0.396378 \\ \end{array} \right).\left( \begin{array}{cc} -22365.2 & -22367.4 \\ 0 & 0.000316165 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ \end{array} \right)$
Using back substitution, $Rx = Q^T b$ gives
$$x = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$
For Cholesky, we have
$$A_C = A^T.A = \left( \begin{array}{cc} 500200030 & 500250040 \\ 500250040 & 500300055 \\ \end{array} \right)$$
$L_{11} = a_{11}^{1/2} = 22365.2, L_{21} = \dfrac{a_{21}}{\sqrt{a_{11}}} = 22367.4 , L_{22} = \sqrt{a_{22}-L_{21}^2}=0.000422864$
Now, we are ready to solve
$$Ax = b \implies LL^T x = b \implies Lc = b, L^Tx = c$$
of course, we arrive at the same result.