Can’t understand step in LU decomposition proof

linear-algebra, matrices

UPDATE: The author of the linked article has updated the proof and now it's perfectly clear.

I'm reading about the LU decomposition on this page and cannot understand one of the final steps in the proof of the following:

Let $A$ be a $K\times K$ matrix. Then, there exists a permutation matrix $P$ such that $PA$ has an LU decomposition: $$PA=LU$$ where $L$ is a lower triangular $K\times K$ matrix and $U$ is an upper triangular $K\times K$ matrix.
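(The statement itself is easy to sanity-check numerically, e.g. with `scipy.linalg.lu`; note that scipy returns a $P$ with $A = PLU$, so its $P^T$ plays the role of the $P$ in the theorem.)

```python
import numpy as np
from scipy.linalg import lu

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

P, L, U = lu(A)                       # scipy convention: A = P @ L @ U
assert np.allclose(A, P @ L @ U)

# In the theorem's form PA = LU, the permutation is scipy's P transposed
assert np.allclose(P.T @ A, L @ U)
assert np.allclose(L, np.tril(L))     # L is lower triangular (unit diagonal)
assert np.allclose(U, np.triu(U))     # U is upper triangular
```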

I'll reproduce the proof here:

Through Gaussian elimination, $A$ can be reduced to a row-echelon (hence upper triangular) matrix $U$ via a series of $n$ elementary operations: $$E_n\bullet\ldots\bullet E_1\bullet A = U$$
Any elementary matrix $E_i$ used in Gaussian elimination is either a permutation matrix $P_i$ or a matrix $L_i$ used to add a multiple of one row to a row below it. Thus, $L_i$ will be lower triangular with nonzero diagonal, and hence invertible. Suppose that the first permutation matrix we encounter is in position $j$, so that we have:
$$E_n\bullet\ldots\bullet E_{j+1}\bullet P_j\bullet L_{j-1}\bullet\ldots\bullet L_1\bullet A = U$$
Since a permutation matrix is orthogonal, $P_j^T P_j = I$ and hence
$$E_n\bullet\ldots\bullet E_{j+1}\bullet P_j\bullet L_{j-1}\bullet P_j^TP_j \bullet\ldots\bullet P_j^TP_j\bullet L_1\bullet P_j^TP_j\bullet A = U$$ or
$$E_n\bullet\ldots\bullet E_{j+1}\bullet\tilde{L}_{j-1}\bullet\ldots\bullet\tilde{L}_1\bullet P_j\bullet A = U$$
where $\tilde{L}_i = P_jL_iP_j^T$ for $i=1,\ldots,j-1$. A matrix $L_i$, used to add $\alpha$ times the $i$-th row to the $s$-th row (in this case $s>i$), can be written as a rank-one update to the identity matrix: $L_i=I+M$, where $M$ is a matrix whose entries are all zeros except $M_{si} = \alpha$. We have that
$$\tilde{L}_i = P_jL_iP_j^T = P_j(I+M)P_j^T = P_jP_j^T + P_jMP_j^T = I+P_jMP_j^T$$
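(As a quick numerical sanity check of this conjugation identity, here is a small numpy sketch with arbitrarily chosen $K$, $i$, $s$ and swap rows, not taken from the article:)

```python
import numpy as np

K = 4
# L_i adds alpha times row i to row s (here i=0, s=3, alpha=2.0, 0-indexed)
i, s, alpha = 0, 3, 2.0
M = np.zeros((K, K))
M[s, i] = alpha
L_i = np.eye(K) + M                  # rank-one update to the identity

# P_j swaps two rows strictly below row i (here rows 2 and 3, 0-indexed)
P_j = np.eye(K)[[0, 1, 3, 2]]

# The conjugation identity: P_j L_i P_j^T = I + P_j M P_j^T
lhs = P_j @ L_i @ P_j.T
rhs = np.eye(K) + P_j @ M @ P_j.T
assert np.allclose(lhs, rhs)

# The nonzero entry of M moved from (3, 0) to (2, 0): still below the diagonal
assert np.allclose(np.triu(lhs, 1), 0)
```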

So far so good. Then we have:

The permutations performed on the rows and columns of $M$ by $P_j$ and $P_j^T$ can move the only non-zero element of $M$, but that element remains below the main diagonal (because $j>i$).

I can't understand this. $M$ is derived from $L_i$, which means that $M_{si} = \alpha \neq 0$ and all other elements of $M$ are $0$. Nothing has been said about $P_j$. I'm guessing $P_j$ swaps the $j$-th row with the $r$-th row, where $r>j$. What do the indices $s,i$ have to do with the indices $r,j$?

Would be grateful if anyone could clear this up!

Best Answer

Your proof states $P_j$ is a permutation matrix. There is no explicit restriction on what $P_j$ permutes, but there is an implicit one, which has to do with the order in which row reduction steps are done.

Since $j > i$, the permutation $P_j$ is applied after the lower triangular factors $L_1, L_2, \ldots, L_{j-1}$. At that point some pivot rows have already been fully processed, and a pivoting step never touches a finished pivot row: $P_j$ only permutes the current pivot row and rows below it. In particular, if $P_j$ swaps rows $q$ and $r$, then $q$ and $r$ are both strictly greater than $i$ for every earlier factor $L_i$, because $i$ is the pivot row that $L_i$ used. Conjugation by $P_j$ therefore leaves column $i$ of $M$ alone, and can only move the nonzero entry of $M$ among rows greater than $i$, so it stays strictly below the main diagonal.

In the example below, the swapped rows also happen to lie below the affected row $s$, and in that special case the conjugation changes nothing at all: $\tilde L_i = L_i$, because $P_j$ and $P_j^T$ agree with $I$ on every row and column that $L_i$ actually touches. In general the multiplier may land in a different row, but it always stays below the diagonal, which is all the proof needs.

FWIW, the fact that this detail is never mentioned makes the proof, in my opinion, fairly poorly constructed, and all of the $\tilde L$ notation makes it difficult to parse. A cleaner approach would be to show directly that a permutation can be pushed past a Gauss factor, i.e. that $P_j L_i = \tilde L_i P_j$ for some lower triangular $\tilde L_i$.
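Concretely, one can check in numpy that a permutation can be pushed past a Gauss factor at the cost of permuting its multipliers (indices here are chosen arbitrarily for illustration, and this time the swap does touch the affected row):

```python
import numpy as np

K = 4
# L adds 2 times row 1 to row 4 (1-indexed): nonzero multiplier at (4, 1)
L = np.eye(K)
L[3, 0] = 2.0

# P swaps rows 2 and 4 (1-indexed), i.e. it moves the affected row
P = np.eye(K)[[0, 3, 2, 1]]

# Push P past L: P @ L = L_tilde @ P, where L_tilde = P L P^T
L_tilde = P @ L @ P.T
assert np.allclose(P @ L, L_tilde @ P)

# L_tilde is not equal to L (the multiplier moved from row 4 to row 2),
# but it is still unit lower triangular
assert not np.allclose(L_tilde, L)
assert np.allclose(np.triu(L_tilde, 1), 0)
```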


Example. In this example, we will take

  • $i = 1$ (column number of nonzero element of $M$)
  • $s = 2$ (row number of nonzero element of $M$)
  • $j = 2$ (row reduction step number of permutation $P_2$)
  • $q = 3$ (one of two rows permuted by $P_2$)
  • $r = 4$ (one of two rows permuted by $P_2$)

A possible choice of $L_1$ and $P_2$ is as follows.

$$L_1 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \hspace{1 in} P_2 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0\end{bmatrix}$$

From these, we can derive $M$ and $P_2^T$.

$$M = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \hspace{1 in} P_2^T = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0\end{bmatrix}$$

Now compute $P_2 M P_2^T$. The rows and columns that $P_2$ swaps (rows 3 and 4) are identically zero in $M$, so the conjugation leaves $M$ unchanged:

$$P_2 M P_2^T = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0\end{bmatrix}\begin{bmatrix} 0 & 0 & 0 & 0 \\ 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0\end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} = M$$

Hence $\tilde L_1 = I + P_2 M P_2^T = I + M = L_1$, which is indeed lower triangular.
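The example above can also be checked numerically; here is a quick numpy sketch with the matrices copied from the example:

```python
import numpy as np

L1 = np.array([[1., 0., 0., 0.],
               [2., 1., 0., 0.],
               [0., 0., 1., 0.],
               [0., 0., 0., 1.]])
P2 = np.array([[1., 0., 0., 0.],
               [0., 1., 0., 0.],
               [0., 0., 0., 1.],
               [0., 0., 1., 0.]])
M = L1 - np.eye(4)                    # the rank-one part: M[1, 0] = 2

# P2 swaps rows/columns 3 and 4, which are zero in M, so conjugation fixes M
assert np.allclose(P2 @ M @ P2.T, M)

# Hence L1_tilde = I + P2 M P2^T = I + M = L1
L1_tilde = np.eye(4) + P2 @ M @ P2.T
assert np.allclose(L1_tilde, L1)
```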