[Math] Cholesky factorization

cholesky decomposition, linear algebra, matrices

Posting link here because it is as much math as it is programming:

Replacing Cholesky factorization from IMSL MCHOL (Fortran) in C#

(Feel free to delete if inappropriate)

(Apologies for having lapsed into abject mathematical ignorance and incompetency over the last few decades).

Thank you for getting this far!

From the documentation of the function I'm trying to replicate:

Computes an upper triangular factorization of a real symmetric matrix A plus a diagonal matrix D, where D is determined sequentially during the Cholesky factorization in order to make A + D nonnegative definite.

Routine MCHOL computes a Cholesky factorization, $R^T R$, of A + D where A is symmetric and D is a diagonal matrix with sufficiently large diagonal elements such that A + D is nonnegative definite. The routine is similar to one described by Gill, Murray, and Wright (1981, pages 108−111). Here, though, we allow A + D to be singular.

The algorithm proceeds sequentially by rows. If A + D is singular, the Cholesky factor R is taken to have some rows that are entirely zero. The i-th row of A + D is declared to be linearly dependent on the first i − 1 rows if the following two conditions are satisfied: (images that I can't really put in here, see the documentation here)

I want to make sure I really understand what is being solved for here. Cholesky factorization wants a positive definite matrix, so we're solving for the smallest modifier that does that? And after that, it's merely a matter of algorithm?

Best Answer

Here is a partial answer: why is such a routine needed?

A Cholesky factorization is $XX^t = A$. But not all A have such a factorization. Whenever matrices get you down, try the 1×1 matrices: numbers. The transpose of [x] is just [x], and so we get: $$[x][x]^t = [x^2] = [a]$$ only works (for real numbers x), if a ≥ 0. You can't take the square root of a negative (and get a real number), and you can't take the square root of a "negative" (definite) matrix. So what happens if we ask for the Cholesky decomposition of [−2]? Well the routine says −2 is too small, so we'll just choose d = 2, and find the Cholesky factorization of [a] + [d] = [a + d] = [−2+2] = [0]. $$[0][0]^t = [0] = [-2] + [2]$$

Given any symmetric matrix, it can be thought of (after finding some other special decompositions) as a diagonal matrix: in other words as a few separate numbers. For example take a = diag(1,−2,3). We can almost find a nice Cholesky decomposition: $$\begin{bmatrix}1 & 0 & 0 \\ 0 & \sqrt{-2} & 0 \\ 0 & 0 & \sqrt{3} \end{bmatrix} \begin{bmatrix}1 & 0 & 0 \\ 0 & \sqrt{-2} & 0 \\ 0 & 0 & \sqrt{3} \end{bmatrix}^t = \begin{bmatrix}1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 3 \end{bmatrix}$$ except for the pesky $\sqrt{-2}$ not being a real number. Never fear, d = diag(0,2,0) is here! $$\begin{bmatrix}1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & \sqrt{3} \end{bmatrix} \begin{bmatrix}1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & \sqrt{3} \end{bmatrix}^t = \begin{bmatrix}1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 3 \end{bmatrix} + \begin{bmatrix}0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \end{bmatrix} $$

Things are a little messier when A and D are not both diagonal (we can assume one is diagonal, but not both for this algorithm), but I think the basic idea is the same: fix the negative diagonal entries or, more precisely, the negative eigenvalues.
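To make the idea concrete for matrices that are not diagonal, here is a minimal Python sketch of a pivot-flooring modified Cholesky. It is an illustration only and not IMSL's MCHOL: the name `modified_cholesky` and the threshold `delta` are assumptions of this sketch, and unlike MCHOL it raises a too-small pivot to a small positive `delta` instead of allowing zero rows. It returns an upper triangular R and the diagonal of D such that $R^t R = A + D$.

```python
import math

def modified_cholesky(A, delta=1e-8):
    """Return (R, d): R upper triangular with R^T R = A + diag(d).

    Simplified sketch (not MCHOL): any pivot below `delta` is raised to
    `delta`, and the amount added is recorded in d.
    """
    n = len(A)
    R = [[0.0] * n for _ in range(n)]
    d = [0.0] * n
    for i in range(n):
        # Pivot: diagonal entry minus the contribution of rows already computed.
        pivot = A[i][i] - sum(R[k][i] ** 2 for k in range(i))
        if pivot < delta:
            d[i] = delta - pivot   # diagonal modification for row i
            pivot = delta
        R[i][i] = math.sqrt(pivot)
        for j in range(i + 1, n):
            s = A[i][j] - sum(R[k][i] * R[k][j] for k in range(i))
            R[i][j] = s / R[i][i]
    return R, d

# The diagonal example from the answer, and a small non-diagonal indefinite matrix.
R1, d1 = modified_cholesky([[1, 0, 0], [0, -2, 0], [0, 0, 3]])
A2 = [[1, 2], [2, 1]]
R2, d2 = modified_cholesky(A2)
```

For diag(1, −2, 3) the returned `d1` is close to (0, 2, 0), in line with the example above. With a very small `delta` the off-diagonal entries of a modified row can become large, which is one reason production routines use more careful thresholds.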

Related Solutions

[Math] Modified Cholesky factorization and retrieving the usual LT matrix

Here is an example of how the effect of columnwise rotation occurs. See the protocol using my MatMate program. The "rot" command performs a column-wise rotation to triangular shape, where the order of the involved rows (the rotation criteria) is given as a list, so the triangular form is just permuted. However, the permutation also includes a re-computation of the diagonal (and of the other entries, too).

;===========================================
;MatMate-Listing vom:06.12.2011 19:40:03
;============================================
[1] A =  mk(4,4,6,3,4,8, 3,6,5,1, 4,5,10,7, 8,1,7, 25)
[2] L = cholesky(A)
      A : 
    6.00        3.00        4.00        8.00
    3.00        6.00        5.00        1.00
    4.00        5.00       10.00        7.00
    8.00        1.00        7.00       25.00

      L : 
    2.45        0.00        0.00        0.00
    1.22        2.12        0.00        0.00
    1.63        1.41        2.31        0.00
    3.27       -1.41        1.59        3.13

[3] M = rot(L,"drei",1´2´3´4)

Command [3] does not change anything because the order given is the default order. Commands [4] and [5] use other orders:

[4] M = rot(L,"drei",2´3´4´1)
      M : 
    1.22        0.62        1.38       -1.48
    2.45        0.00        0.00        0.00
    2.04        2.42       -0.00        0.00
    0.41        2.55        4.28       -0.00

[5] M = rot(L,"drei",3´4´1´2)
      M : 
    1.26        1.16        1.75        0.00
    1.58       -0.56        0.94        1.52
    3.16       -0.00       -0.00       -0.00
    2.21        4.48        0.00       -0.00

In command [6] we find the solution with the number 5 in the first column:

[6] M = rot(L,"drei",4´1´2´3)
      M : 
    1.60        1.85       -0.00       -0.00
    0.20        1.44        1.97        0.00
    1.40        0.95        1.70       -2.06
    5.00        0.00       -0.00       -0.00

After that I found the "modified cholesky" solution by the rotation in the following order:

[7] M = rot(L,"drei",4´3´2´1)
      M : 
    1.60        0.62        0.92        1.48
    0.20        1.66        1.79        0.00
    1.40        2.84        0.00       -0.00
    5.00        0.00        0.00        0.00

only that the rows are permuted.

So we can even systematically recover the original vanilla-cholesky matrix from that of the modified-cholesky-procedure: if we have r rows, then we need at most r(r-1)/2 rotations to triangular shape to recover the original vanilla-cholesky matrix.
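The listings can be cross-checked without MatMate. The following Python sketch rests on an assumption that is not taken from MatMate's documentation: rotating L to triangular shape in a given row order gives, up to the signs of the columns, the plain Cholesky factor of A with its rows and columns taken in that order, with the rows then put back in their original positions. The function names are made up for this sketch.

```python
import math

def cholesky_lower(A):
    """Plain Cholesky: lower triangular L with L L^T = A (A positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        L[j][j] = math.sqrt(A[j][j] - sum(L[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, n):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = s / L[j][j]
    return L

def cholesky_in_order(A, order):
    """Factor A with rows/columns taken in `order` (0-based), rows put back in place."""
    n = len(A)
    B = [[A[order[i]][order[j]] for j in range(n)] for i in range(n)]
    G = cholesky_lower(B)
    M = [None] * n
    for i, row in enumerate(order):
        M[row] = G[i]
    return M

def times_transpose(M):
    """Return M M^T, used to check that a factor still reproduces A."""
    n = len(M)
    return [[sum(M[i][k] * M[j][k] for k in range(len(M[0]))) for j in range(n)]
            for i in range(n)]

A = [[6, 3, 4, 8], [3, 6, 5, 1], [4, 5, 10, 7], [8, 1, 7, 25]]
L = cholesky_lower(A)                   # compare with listing [2]
M = cholesky_in_order(A, [3, 2, 1, 0])  # compare with listing [7], order 4,3,2,1
```

Both `L` and `M` satisfy `times_transpose(...) == A` up to rounding, so they are two factors of the same matrix that differ only by a rotation of the columns.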

[update 2] A remark about "rotation of columns". Rotation of columns can be thought of as postmultiplication of the Cholesky factor L by some rotation matrix T. T itself is generated by successively rotating pairs of columns, so T can be seen as a product of elementary rotation matrices $\small t_{k,j}=\begin{bmatrix} \cos(\varphi_{k,j}) & \sin(\varphi_{k,j}) \\ -\sin(\varphi_{k,j}) & \cos(\varphi_{k,j}) \end{bmatrix}$, where the k,j-th rotation parameters are inserted into the identity matrix at the appropriate pair of rows k and j and the same columns. Then $\small T = t_{1,2} \cdot t_{1,3} \cdot t_{1,4} \cdot \ldots \cdot t_{2,3} \cdot t_{2,4} \cdot \ldots \cdot t_{3,4} \cdot \ldots \cdot t_{n-1,n}$, where n is the number of columns of the matrix to be rotated. In particular, T is not a permutation matrix!
The different rotations in the example above occur because, for the triangular rotation, we need to find a rotation matrix $\small T_1$ which rotates the matrix L such that the entries in row 1 are collected in column 1; call this version of L $\small L_1$. Next we need to find the rotation matrix $\small T_2$ which then rotates $\small L_1$ such that the entries in row 2 are collected in column 2, without touching column 1. Save this in matrix $\small L_2$, and so on, until $\small L_n = L \cdot T_1 \cdot T_2 \cdot T_3 \cdot \ldots \cdot T_n$ is a triangular matrix. The different versions in the example above then simply come from changing the order of the $\small T_k$ in that product, according to the list given in the rot(...) command in MatMate (you may try this using MatMate yourself).
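A single elementary rotation can be sketched in Python as follows. The helper `rotate_columns` is hypothetical (it is not MatMate's `rot`), and its sign convention for the angle is chosen for convenience rather than matched to $t_{k,j}$ above. It rotates two columns so that one row's entries are collected in the first of them, and the product $M M^T$ stays equal to A because the rotation is orthogonal.

```python
import math

def cholesky_lower(A):
    """Plain Cholesky: lower triangular L with L L^T = A (A positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        L[j][j] = math.sqrt(A[j][j] - sum(L[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, n):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = s / L[j][j]
    return L

def rotate_columns(M, k, j, row):
    """Rotate columns k and j of M so that M[row][j] becomes 0."""
    a, b = M[row][k], M[row][j]
    radius = math.hypot(a, b)
    out = [list(r) for r in M]
    if radius == 0.0:
        return out                      # nothing to rotate
    c, s = a / radius, b / radius       # cosine and sine of the rotation angle
    for i in range(len(M)):
        out[i][k] = c * M[i][k] + s * M[i][j]
        out[i][j] = -s * M[i][k] + c * M[i][j]
    return out

def times_transpose(M):
    """Return M M^T."""
    n = len(M)
    return [[sum(M[i][k] * M[j][k] for k in range(len(M[0]))) for j in range(n)]
            for i in range(n)]

A = [[6, 3, 4, 8], [3, 6, 5, 1], [4, 5, 10, 7], [8, 1, 7, 25]]
L = cholesky_lower(A)
# Collect the entries of row 2 (index 1) in column 1 (index 0).
M = rotate_columns(L, 0, 1, row=1)
```

After this one rotation, row 2 of `M` starts with $\sqrt{6} \approx 2.45$ followed by zeros, like row 2 in listing [4], while `M` as a whole is no longer a row permutation of `L`.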

[update] The "modified cholesky" seems to proceed by first extracting the row/column (the "factor") which has the highest entry on the diagonal ("variance"). It then proceeds in the same way with the reduced matrix to extract the next row/column. This explains the final permutation order between rows 1 and 2, which initially have the same diagonal value 6, and the factor in row 1 has even a greater "individual variance" than the factor in row 2. See the following protocol, where I used a copy "chk" of the original matrix A and always extracted the factor with the highest variance in the (residual) covariance matrix (MatMate command "RemVar", "remove variance of one factor"). The negative signs at the zeros are spurious numerical machine-epsilon effects:

[24] chk = A
      chk : 
    6.00        3.00        4.00        8.00
    3.00        6.00        5.00        1.00
    4.00        5.00       10.00        7.00
    8.00        1.00        7.00       25.00

[25] chk = remvar(chk,4)
      chk : 
    3.44        2.68        1.76        0.00
    2.68        5.96        4.72        0.00
    1.76        4.72        8.04        0.00
    0.00        0.00        0.00        0.00

[26] chk = remvar(chk,3)
      chk : 
    3.05        1.65       -0.00        0.00
    1.65        3.19       -0.00        0.00
   -0.00       -0.00       -0.00        0.00
    0.00        0.00        0.00        0.00

[27] chk = remvar(chk,2)
      chk : 
    2.20        0.00        0.00        0.00
    0.00        0.00        0.00        0.00
    0.00        0.00       -0.00        0.00
    0.00        0.00        0.00        0.00
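The greedy extraction in this protocol can be sketched in Python. The functions below are assumptions of this sketch: `remove_variance` is a Schur-complement step that reproduces the numbers shown for `remvar` in the listing, but it is not MatMate's actual implementation, and `greedy_order` assumes a positive definite input so that no pivot is zero.

```python
def remove_variance(C, k):
    """Subtract from C the part explained by factor k (a Schur-complement step)."""
    n = len(C)
    return [[C[i][j] - C[i][k] * C[k][j] / C[k][k] for j in range(n)]
            for i in range(n)]

def greedy_order(A):
    """Repeatedly extract the remaining index with the largest residual diagonal."""
    C = [[float(x) for x in row] for row in A]
    remaining = list(range(len(A)))
    order = []
    while remaining:
        k = max(remaining, key=lambda i: C[i][i])   # highest residual "variance"
        order.append(k)
        remaining.remove(k)
        C = remove_variance(C, k)
    return order

A = [[6, 3, 4, 8], [3, 6, 5, 1], [4, 5, 10, 7], [8, 1, 7, 25]]
step1 = remove_variance(A, 3)   # compare with listing [25]
order = greedy_order(A)         # 0-based extraction order
```

For this matrix `order` comes out as `[3, 2, 1, 0]`, that is rows 4, 3, 2, 1, which is the order used in command [7].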

[Math] Cholesky factorization for a non-positive semidefinite matrix

Cholesky factorisation is only possible for symmetric matrices, and your $a$ is not symmetric.

First two sentences of Matlab's help:

CHOL(A) uses only the diagonal and upper triangle of A. The lower triangle is assumed to be the (complex conjugate) transpose of the upper triangle.

I agree it should warn if the matrix is not symmetric.
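The behaviour described in the help text can be imitated in a few lines of Python. This is a sketch for real matrices only, with made-up names (`is_symmetric`, `chol_upper_only`) and a made-up 2×2 example; it is not MATLAB's implementation. The factorization reads only the diagonal and upper triangle, so a non-symmetric input is silently treated as if it were symmetric unless it is checked first.

```python
import math

def is_symmetric(a, tol=1e-12):
    """True if a[i][j] and a[j][i] agree within tol for all i, j."""
    n = len(a)
    return all(abs(a[i][j] - a[j][i]) <= tol for i in range(n) for j in range(i))

def chol_upper_only(a):
    """Upper triangular R with R^T R = S, where S mirrors the upper triangle of a."""
    n = len(a)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        pivot = a[i][i] - sum(R[k][i] ** 2 for k in range(i))
        if pivot <= 0.0:
            raise ValueError("matrix is not positive definite")
        R[i][i] = math.sqrt(pivot)
        for j in range(i + 1, n):
            # Only entries with j > i are read; a[j][i] is never touched.
            R[i][j] = (a[i][j] - sum(R[k][i] * R[k][j] for k in range(i))) / R[i][i]
    return R

a = [[4.0, 2.0], [100.0, 3.0]]          # hypothetical non-symmetric input
R = chol_upper_only(a)                   # the 100 below the diagonal is ignored
rtr = [[sum(R[k][i] * R[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
```

Here `rtr` reproduces the symmetric matrix with rows (4, 2) and (2, 3), not `a` itself, so calling `is_symmetric(a)` before factoring is a cheap way to get the warning asked for above.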
