I have been looking at the modified Cholesky decomposition suggested by the following paper: Schnabel and Eskow, "A Revised Modified Cholesky Factorization Algorithm," SIAM J. Optim. 9, pp. 1135-1148 (1999).
The paper describes an implementation of the Cholesky decomposition, modified so that when the matrix is not positive definite, a diagonal perturbation is added before the decomposition takes place. The algorithm given in the paper (Algorithm 1) suggests that it finds a factorization such that $LL^\top = A + E$, with $E \ge 0$ diagonal. But when implemented as stated in the paper, what I get is a lower triangular factor matrix $L$ such that:
$$P\cdot(L\cdot L^\top)\cdot P^\top = A + E$$
where $P$ is a permutation matrix.
This $L$ matrix is not the same as the one obtained from the ordinary Cholesky factorization of a PD matrix, $A = L_1L_1^\top$, say.
Now, I am using this in an optimization context, specifically in trust-region methods, where the decomposition is followed by inverting $L$ (to compute the trust-region step), so it would be helpful to have the factor in plain lower triangular form, without the permutation. Is there a way to get back $L_1$ (the ordinary Cholesky factor) of a positive definite matrix from the $P$ and $L$ obtained from the modified Cholesky factorization? I am a bit surprised that the algorithm would misstate what it produces, so maybe I am missing a step here.
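(For context, here is a minimal NumPy/SciPy sketch of how such a permuted factor can be applied directly in the trust-region subproblem, assuming $P$ is stored as a dense permutation matrix. Since $P$ is orthogonal, $(A+E)\,p = -g$ reduces to two triangular solves, so recovering $L_1$ is not strictly required for the linear solve itself; `newton_step` is a hypothetical helper name.)

```python
import numpy as np
from scipy.linalg import solve_triangular

def newton_step(P, L, g):
    """Solve (A + E) p = -g given the factorization P (L L^T) P^T = A + E.
    P is orthogonal (P^{-1} = P^T), so two triangular solves suffice."""
    y = solve_triangular(L, -P.T @ g, lower=True)   # L y = -P^T g
    z = solve_triangular(L.T, y, lower=False)       # L^T z = y
    return P @ z                                    # p = P z

# Hypothetical usage on the example matrix from this question:
A = np.array([[6., 3, 4, 8], [3, 6, 5, 1], [4, 5, 10, 7], [8, 1, 7, 25]])
P = np.eye(4)[::-1]                  # reversal permutation, as in this question
L = np.linalg.cholesky(P.T @ A @ P)  # factor of the permuted matrix
g = np.array([1., -2., 3., 0.5])
p = newton_step(P, L, g)             # satisfies A p = -g, since E = 0 here
```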
Related threads I have found so far:
The second thread says that such a relationship cannot be found for an arbitrary pair of matrices (not necessarily ones arising from a modified Cholesky decomposition as here). I am not sure whether that answer still holds in this case.
Example:
Consider the following $4\times4$ matrix which is PD.
$$A = \begin{bmatrix}6 & 3 & 4 & 8 \\ 3 & 6 & 5 & 1 \\ 4 & 5 & 10 & 7 \\ 8 & 1 & 7 & 25 \end{bmatrix}$$
The vanilla Cholesky factor $L_1$, such that $L_1L_1^\top=A$ is:
$$L_1 = \begin{bmatrix}2.4495 & 0 & 0 & 0 \\ 1.2247 & 2.1213 & 0 & 0 \\ 1.6330 & 1.4142 & 2.3094 & 0 \\ 3.2660 & -1.4142 & 1.5877 & 3.1325 \end{bmatrix}$$
Now, if I perform the modified Cholesky, I get:
$$L=\begin{bmatrix}5 & 0 & 0 & 0 \\ 1.4 & 2.83549 & 0 & 0 \\ 0.2 & 1.66462 & 1.78579 & 0 \\ 1.6 & 0.62070 & 0.92215 & 1.48471 \end{bmatrix}$$
and
$$P=\begin{bmatrix}0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}$$
These satisfy $P\cdot (L\cdot L^\top)\cdot P^\top = A$; of course, since $A$ is PD, $E=0$.
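The example can be checked numerically; a short NumPy verification (the entries of $L$ above are rounded to five decimals, hence the loose tolerance):

```python
import numpy as np

# Matrices as given above
A = np.array([[6., 3, 4, 8],
              [3, 6, 5, 1],
              [4, 5, 10, 7],
              [8, 1, 7, 25]])
L = np.array([[5.,  0.,      0.,      0.],
              [1.4, 2.83549, 0.,      0.],
              [0.2, 1.66462, 1.78579, 0.],
              [1.6, 0.62070, 0.92215, 1.48471]])
P = np.eye(4)[::-1]   # the anti-diagonal (reversal) permutation above

print(np.allclose(P @ L @ L.T @ P.T, A, atol=1e-3))   # True
```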
Best Answer
Here is an example of how the effect of columnwise rotation occurs; see the protocol from my MatMate program. The "rot" command performs a columnwise rotation to triangular shape, where the order of the involved rows (the rotation criterion) is given as a list, so the triangular form is just permuted accordingly. However, the permutation also entails a recomputation of the diagonal (and of the other entries, too).
Command [3] does not change anything because the order given is the default order. In command [6] we find the solution with the entry 5 in the first column:
After that, I found the "modified Cholesky" solution by rotating in the following order:
It is the same matrix, only with the rows permuted.
So we can even systematically recover the original vanilla Cholesky factor from that of the modified Cholesky procedure: if there are $r$ rows, then at most $r(r-1)/2$ rotations to triangular shape are needed.
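As a shortcut, the net effect of all those rotations can be obtained in a single call via a QR decomposition (a NumPy sketch, not the MatMate code; the tolerance is loose because the printed $L$ is rounded). Since $(PL)(PL)^\top = A = L_1L_1^\top$, the matrix $PL$ differs from $L_1$ only by an orthogonal factor on the right, which QR strips off:

```python
import numpy as np

def recover_vanilla_factor(P, L):
    """Given P (L L^T) P^T = A, return the lower triangular L1 with
    A = L1 L1^T.  P L = L1 Q for some orthogonal Q, so a QR decomposition
    of (P L)^T yields L1 up to column signs."""
    M = P @ L
    _, R = np.linalg.qr(M.T)            # M^T = Q R  =>  M = R^T Q^T
    return R.T * np.sign(np.diag(R))    # flip column signs: positive diagonal

# Applied to the example from the question:
A = np.array([[6., 3, 4, 8], [3, 6, 5, 1], [4, 5, 10, 7], [8, 1, 7, 25]])
L = np.array([[5., 0, 0, 0],
              [1.4, 2.83549, 0, 0],
              [0.2, 1.66462, 1.78579, 0],
              [1.6, 0.62070, 0.92215, 1.48471]])
P = np.eye(4)[::-1]
L1 = recover_vanilla_factor(P, L)       # matches the vanilla factor above
```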
[update 2] A remark about "rotation of columns". Rotation of columns can be thought of as postmultiplication of the Cholesky factor $L$ by some rotation matrix $T$. $T$ itself is generated by successively rotating pairs of columns, so $T$ can be seen as a product of elementary rotation matrices $\small t_{k,j}=\begin{pmatrix} \cos(\varphi_{k,j}) & \sin(\varphi_{k,j}) \\ -\sin(\varphi_{k,j}) & \cos(\varphi_{k,j}) \end{pmatrix}$, where the $k,j$'th rotation parameters are inserted into the identity matrix at the appropriate pair of rows $k$ and $j$ and the same columns. Then $\small T = t_{1,2} \cdot t_{1,3} \cdot t_{1,4} \cdots t_{2,3} \cdot t_{2,4} \cdots t_{3,4} \cdots t_{n-1,n}$, where $n$ is the number of columns of the matrix to be rotated. In particular, $T$ is not a permutation matrix!
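For illustration, a small NumPy sketch of such an elementary rotation $t_{k,j}$ and a product $T$ in the stated order (the angles are arbitrary illustration values, chosen only to show that $T$ is orthogonal but not a permutation matrix):

```python
import numpy as np

def t_rot(n, k, j, phi):
    """Elementary rotation t_{k,j}: a 2x2 rotation by phi embedded in the
    n x n identity at rows/columns k and j."""
    t = np.eye(n)
    c, s = np.cos(phi), np.sin(phi)
    t[k, k] = t[j, j] = c
    t[k, j], t[j, k] = s, -s
    return t

# Product t_{1,2} t_{1,3} t_{2,3} (0-based indices), arbitrary angles:
T = t_rot(3, 0, 1, 0.3) @ t_rot(3, 0, 2, 0.5) @ t_rot(3, 1, 2, 0.7)
print(np.allclose(T @ T.T, np.eye(3)))   # True: T is orthogonal
```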
The different rotations mentioned in the example above occur because, for the triangular rotation, we first need a rotation matrix $\small T_1$ which rotates $L$ such that the entries in row 1 are collected in column 1; call this version $\small L^{(1)}$ (not to be confused with the vanilla Cholesky factor $L_1$ above). Next we need the rotation matrix $\small T_2$ which rotates $\small L^{(1)}$ such that the entries in row 2 are collected in column 2, without touching column 1. Save this as $\small L^{(2)}$, and so on, until $\small L^{(n)} = L \cdot T_1 \cdot T_2 \cdot T_3 \cdots T_n$ is a triangular matrix. The different versions in the example above arise simply because the order of the $\small T_k$ in that product is changed (according to the list given in the rot(...) command in MatMate). (You may try this in MatMate yourself.)
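The row-by-row procedure just described can be sketched explicitly (a NumPy sketch, not the MatMate code): sweep the rows in order and, for each row, rotate pairs of columns so that its entries are collected on the diagonal, counting the elementary rotations used.

```python
import numpy as np

def triangularize_by_column_rotations(M):
    """Reduce M to lower triangular form by postmultiplied elementary
    column rotations; returns the triangular matrix and the number of
    rotations used (at most n(n-1)/2)."""
    M = M.copy()
    n = M.shape[0]
    used = 0
    for i in range(n):                  # collect row i's entries in column i
        for j in range(i + 1, n):
            a, b = M[i, i], M[i, j]
            if b == 0.0:
                continue                # already zero, no rotation needed
            r = np.hypot(a, b)
            c, s = a / r, b / r         # cos/sin chosen to zero M[i, j]
            ci, cj = M[:, i].copy(), M[:, j].copy()
            M[:, i] = c * ci + s * cj
            M[:, j] = c * cj - s * ci
            used += 1
    return M, used

# Applied to P L from the example, this rotates back to the vanilla factor:
A = np.array([[6., 3, 4, 8], [3, 6, 5, 1], [4, 5, 10, 7], [8, 1, 7, 25]])
L = np.array([[5., 0, 0, 0],
              [1.4, 2.83549, 0, 0],
              [0.2, 1.66462, 1.78579, 0],
              [1.6, 0.62070, 0.92215, 1.48471]])
P = np.eye(4)[::-1]
L1, used = triangularize_by_column_rotations(P @ L)
```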
[update] The "modified Cholesky" seems to proceed by first extracting the row/column (the "factor") which has the largest entry on the diagonal (the "variance"), and then continuing with the reduced matrix after that extraction. This explains the final permutation order between rows 1 and 2, which initially have the same diagonal value 6, even though the factor in row 1 has a greater "individual variance" than the factor in row 2. See the following protocol, where I used a copy "chk" of the original matrix $A$ and always extracted the factor with the highest variance in the (residual) covariance matrix (MatMate command "RemVar", "remove variance of one factor"). The negative signs at the zeros are spurious numerical noise at machine epsilon:
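The pivoting rule described here (pick the largest remaining diagonal, then continue on the reduced matrix) can be sketched as a diagonal-pivoted Cholesky in NumPy. This sketch ignores the Schnabel-Eskow perturbation $E$ entirely, which is fine here since $A$ is PD; on the example matrix it reproduces the $P$ and $L$ from the question.

```python
import numpy as np

def pivoted_cholesky(A):
    """Cholesky with symmetric pivoting: at each step move the largest
    remaining diagonal entry ("highest variance") to the pivot position.
    Returns (P, L) with P @ L @ L.T @ P.T == A.  No perturbation E is
    added, so A must be positive definite."""
    A = np.array(A, dtype=float)        # work on a copy
    n = A.shape[0]
    perm = np.arange(n)
    L = np.zeros_like(A)
    for k in range(n):
        p = k + np.argmax(np.diag(A)[k:])   # pivot: largest remaining diagonal
        A[[k, p]] = A[[p, k]]               # swap rows ...
        A[:, [k, p]] = A[:, [p, k]]         # ... and columns symmetrically
        L[[k, p]] = L[[p, k]]               # keep computed L rows in sync
        perm[[k, p]] = perm[[p, k]]
        L[k, k] = np.sqrt(A[k, k])
        L[k+1:, k] = A[k+1:, k] / L[k, k]
        A[k+1:, k+1:] -= np.outer(L[k+1:, k], L[k+1:, k])  # Schur complement
    P = np.eye(n)[perm].T               # so that P (L L^T) P^T = A
    return P, L

A = np.array([[6., 3, 4, 8], [3, 6, 5, 1], [4, 5, 10, 7], [8, 1, 7, 25]])
P, L = pivoted_cholesky(A)              # reproduces the P and L from the question
```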