[Math] Modified Cholesky factorization and retrieving the usual LT matrix

cholesky-decomposition, matrices, matrix-decomposition, nonlinear-optimization, numerical-linear-algebra

I have been looking at the modified Cholesky decomposition suggested in the following paper: Schnabel and Eskow, "A Revised Modified Cholesky Factorization Algorithm", SIAM J. Optim. 9, pp. 1135-1148.

The paper describes an implementation of the Cholesky decomposition, modified in that when the matrix is not positive definite, a diagonal perturbation is added before the decomposition takes place. The algorithm given in the paper (Algorithm 1) suggests that it finds a factorization such that $LL^\top = A + E$, with $E \ge 0$. But when implemented as stated in the paper, what I get is a lower triangular factor $L$ such that
$$P\cdot(L\cdot L^\top)\cdot P^\top = A + E,$$
where $P$ is a permutation matrix.

This $L$ is not the same as the factor obtained from the ordinary Cholesky factorization of a PD matrix, where $A = L_1L_1^\top$, say.

Now, I am using this in an optimization context, specifically in trust-region methods, where the decomposition is followed by inverting $L$ (to compute the trust-region step), so it would be helpful to have the factor in lower triangular form. Is there a way to get back $L_1$ (the ordinary Cholesky factor) of a positive definite matrix from the $P$ and $L$ obtained from the modified Cholesky factorization? I am a bit surprised that the algorithm misstates what it produces, so maybe I am missing a step here.

Related threads I have found so far:

  1. Cholesky factorization
  2. Cholesky decomposition and permutation matrix

The second thread says that such a relationship cannot be found for an arbitrary matrix (not necessarily in the setting of a modified Cholesky decomposition as here). I am not sure whether that answer still holds in this case.

Example:
Consider the following $4\times4$ matrix which is PD.
$$A = \begin{bmatrix}6 & 3 & 4 & 8 \\ 3 & 6 & 5 & 1 \\ 4 & 5 & 10 & 7 \\ 8 & 1 & 7 & 25 \end{bmatrix}$$
The vanilla Cholesky factor $L_1$, such that $L_1L_1^\top=A$ is:
$$L_1 = \begin{bmatrix}2.4495 & 0 & 0 & 0 \\ 1.2247 & 2.1213 & 0 & 0 \\ 1.6330 & 1.4142 & 2.3094 & 0 \\ 3.2660 & -1.4142 & 1.5877 & 3.1325 \end{bmatrix}$$
Now, if I perform the modified Cholesky, I get:
$$L=\begin{bmatrix}5 & 0 & 0 & 0 \\ 1.4 & 2.83549 & 0 & 0 \\ 0.2 & 1.66462 & 1.78579 & 0 \\ 1.6 & 0.62070 & 0.92215 & 1.48471 \end{bmatrix}$$
and
$$P=\begin{bmatrix}0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}$$
such that $P\cdot (L\cdot L^\top)\cdot P^\top = A$. Of course, since $A$ is PD, $E=0$.
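As a numerical cross-check, here is a minimal NumPy sketch (`L_mod` and `P` are simply the modified-Cholesky factor and permutation quoted above, not computed by the algorithm itself); it verifies the stated relation and already shows that $P\,L$ is another square root of $A$:

    import numpy as np

    # Matrices copied from the example above
    A = np.array([[6, 3, 4, 8],
                  [3, 6, 5, 1],
                  [4, 5, 10, 7],
                  [8, 1, 7, 25]], dtype=float)

    L1 = np.linalg.cholesky(A)            # vanilla Cholesky: A = L1 @ L1.T

    L_mod = np.array([[5.0, 0.0,     0.0,     0.0],
                      [1.4, 2.83549, 0.0,     0.0],
                      [0.2, 1.66462, 1.78579, 0.0],
                      [1.6, 0.62070, 0.92215, 1.48471]])
    P = np.eye(4)[::-1]                   # the anti-diagonal permutation quoted above

    # The relation reported by the modified Cholesky (E = 0 since A is PD)
    print(np.allclose(P @ L_mod @ L_mod.T @ P.T, A, atol=1e-3))   # True

    # Hence M = P @ L_mod is another square root of A, i.e. M @ M.T = A,
    # so M can differ from L1 only by an orthogonal factor on the right.
    M = P @ L_mod
    print(np.allclose(M @ M.T, A, atol=1e-3))                     # True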

Best Answer

Here is an example of how the effect of a columnwise rotation occurs. See the protocol below, produced with my MatMate program. The "rot" command performs a columnwise rotation to triangular shape, where the order of the involved rows (the rotation criterion) is given as a list, so the triangular form is correspondingly permuted. However, the permutation also entails a recomputation of the diagonal (and of the other entries, too).

;===========================================
;MatMate listing from 06.12.2011 19:40:03
;============================================
[1] A =  mk(4,4,6,3,4,8, 3,6,5,1, 4,5,10,7, 8,1,7, 25)
[2] L = cholesky(A)
      A : 
    6.00        3.00        4.00        8.00
    3.00        6.00        5.00        1.00
    4.00        5.00       10.00        7.00
    8.00        1.00        7.00       25.00

      L : 
    2.45        0.00        0.00        0.00
    1.22        2.12        0.00        0.00
    1.63        1.41        2.31        0.00
    3.27       -1.41        1.59        3.13

[3] M = rot(L,"drei",1´2´3´4)

Command [3] does not change anything because the order given is the default order. In

[4] M = rot(L,"drei",2´3´4´1)
      M : 
    1.22        0.62        1.38       -1.48
    2.45        0.00        0.00        0.00
    2.04        2.42       -0.00        0.00
    0.41        2.55        4.28       -0.00

[5] M = rot(L,"drei",3´4´1´2)
      M : 
    1.26        1.16        1.75        0.00
    1.58       -0.56        0.94        1.52
    3.16       -0.00       -0.00       -0.00
    2.21        4.48        0.00       -0.00

In command [6] we find the solution with the entry 5 in the first column:

[6] M = rot(L,"drei",4´1´2´3)
      M : 
    1.60        1.85       -0.00       -0.00
    0.20        1.44        1.97        0.00
    1.40        0.95        1.70       -2.06
    5.00        0.00       -0.00       -0.00

After that I found the "modified Cholesky" solution by rotating in the following order:

[7] M = rot(L,"drei",4´3´2´1)
      M : 
    1.60        0.62        0.92        1.48
    0.20        1.66        1.79        0.00
    1.40        2.84        0.00       -0.00
    5.00        0.00        0.00        0.00

which is the "modified Cholesky" factor from the question, only with its rows permuted.

So we can even systematically recover the original vanilla Cholesky factor from the one produced by the modified Cholesky procedure: if we have $r$ rows, then we need at most $r(r-1)/2$ rotations to triangular shape to recover it.
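Here is a sketch of that recovery in NumPy (my own rendering of the idea, not MatMate's rot command): form $M = P\,L$ and re-triangularize it from the right with at most $n(n-1)/2$ Givens column rotations; since $M M^\top = A$ and the Cholesky factor with positive diagonal is unique, the result must be the vanilla factor $L_1$.

    import numpy as np

    A = np.array([[6, 3, 4, 8], [3, 6, 5, 1],
                  [4, 5, 10, 7], [8, 1, 7, 25]], dtype=float)
    L_mod = np.array([[5.0, 0.0, 0.0, 0.0], [1.4, 2.83549, 0.0, 0.0],
                      [0.2, 1.66462, 1.78579, 0.0], [1.6, 0.62070, 0.92215, 1.48471]])
    P = np.eye(4)[::-1]                      # permutation from the question

    def retriangularize(M):
        """Re-triangularize M by Givens rotations applied to its columns
        (right multiplication); M @ M.T is unchanged throughout."""
        M = np.array(M, dtype=float)
        n = M.shape[0]
        for i in range(n):                   # at most n(n-1)/2 rotations in total
            for j in range(i + 1, n):
                a, b = M[i, i], M[i, j]
                if b == 0.0:
                    continue                 # entry already zero
                r = np.hypot(a, b)
                c, s = a / r, b / r
                col_i, col_j = M[:, i].copy(), M[:, j].copy()
                M[:, i] = c * col_i + s * col_j      # new M[i, i] = r >= 0
                M[:, j] = -s * col_i + c * col_j     # new M[i, j] = 0
        # Flip columns with a negative diagonal entry (still an orthogonal factor)
        signs = np.where(np.diag(M) < 0, -1.0, 1.0)
        return M * signs

    L1_rec = retriangularize(P @ L_mod)
    print(np.round(L1_rec, 4))                                    # the vanilla factor
    print(np.allclose(L1_rec, np.linalg.cholesky(A), atol=1e-3))  # True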


[update 2] A remark about "rotation of columns". Rotation of columns can be thought of as postmultiplication of the Cholesky factor $L$ by some rotation matrix $T$. $T$ itself is generated by successively rotating pairs of columns, so $T$ can be seen as a product of elementary rotation matrices: the $k,j$'th rotation parameters $\small \begin{pmatrix} \cos(\varphi_{k,j}) & \sin(\varphi_{k,j}) \\ -\sin(\varphi_{k,j}) & \cos(\varphi_{k,j}) \end{pmatrix}$ are inserted into the identity matrix at the appropriate pair of rows $k$ and $j$ and the same columns, giving $t_{k,j}$. Then $\small T = t_{1,2} \cdot t_{1,3} \cdot t_{1,4} \cdots t_{2,3} \cdot t_{2,4} \cdots t_{3,4} \cdots t_{n-1,n}$, where $n$ is the number of columns of the matrix that is to be rotated. In particular, $T$ is not a permutation matrix!
The different rotations mentioned in the example above occur because, for the triangular rotation, we first need a rotation matrix $\small T_1$ which rotates $L$ such that the entries in row 1 are collected in column 1; call this version of $L$ $\small L_1$. Next we need the rotation matrix $\small T_2$ which rotates $\small L_1$ such that the entries in row 2 are collected in column 2, but without touching column 1. Save this in matrix $\small L_2$, and so on, until $\small L_n = L \cdot T_1 \cdot T_2 \cdot T_3 \cdots T_n$ is a triangular matrix. The different versions in the example above arise simply because the order of the $\small T_k$ in that product is changed (according to the list given in the rot(...) command in MatMate); you may try this in MatMate yourself.
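To see this concretely with the matrices from the question (a NumPy sketch; here $L_1$ denotes the question's vanilla Cholesky factor, not the intermediate $L_1$ of the previous paragraph): the rotation matrix connecting the two factors is $T = L_1^{-1}(P\,L)$, and it is orthogonal but clearly not a permutation matrix.

    import numpy as np

    A = np.array([[6, 3, 4, 8], [3, 6, 5, 1],
                  [4, 5, 10, 7], [8, 1, 7, 25]], dtype=float)
    L_mod = np.array([[5.0, 0.0, 0.0, 0.0], [1.4, 2.83549, 0.0, 0.0],
                      [0.2, 1.66462, 1.78579, 0.0], [1.6, 0.62070, 0.92215, 1.48471]])
    P = np.eye(4)[::-1]
    L1 = np.linalg.cholesky(A)

    # P @ L_mod = L1 @ T  =>  T = L1^{-1} @ (P @ L_mod)
    T = np.linalg.solve(L1, P @ L_mod)
    print(np.round(T, 3))                               # dense entries, not 0/1
    print(np.allclose(T @ T.T, np.eye(4), atol=1e-3))   # True: T is orthogonal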
[update] The "modified Cholesky" seems to proceed by first extracting the row/column (the "factor") which has the largest entry on the diagonal (the "variance"), and then continuing in the same way with the reduced matrix that remains after the extraction. This explains the final permutation order between rows 1 and 2, which initially have the same diagonal value 6 (and the factor in row 1 even has a greater "individual variance" than the factor in row 2). See the following protocol, where I used a copy "chk" of the original matrix A and repeatedly extracted the factor with the largest variance in the (residual) covariance matrix (MatMate command "RemVar", "remove the variance of one factor"); a NumPy sketch of this extraction step follows the protocol. The negative signs at the zeros are spurious (numerical machine epsilon):

[24] chk = A
      chk : 
    6.00        3.00        4.00        8.00
    3.00        6.00        5.00        1.00
    4.00        5.00       10.00        7.00
    8.00        1.00        7.00       25.00

[25] chk = remvar(chk,4)
      chk : 
    3.44        2.68        1.76        0.00
    2.68        5.96        4.72        0.00
    1.76        4.72        8.04        0.00
    0.00        0.00        0.00        0.00

[26] chk = remvar(chk,3)
      chk : 
    3.05        1.65       -0.00        0.00
    1.65        3.19       -0.00        0.00
   -0.00       -0.00       -0.00        0.00
    0.00        0.00        0.00        0.00

[27] chk = remvar(chk,2)
      chk : 
    2.20        0.00        0.00        0.00
    0.00        0.00        0.00        0.00
    0.00        0.00       -0.00        0.00
    0.00        0.00        0.00        0.00