Your proof states $P_j$ is a permutation matrix. There is no explicit restriction on what $P_j$ permutes, but there is an implicit one, which has to do with the order in which row reduction steps are done.
Since $j > i$, the permutation $P_j$ is performed after the lower triangular factors $L_1, L_2, \ldots, L_{j-1}$, so it can only permute rows $j+1$ and greater: rows $1$ through $j$ have already been taken care of by those factors. So, if $P_j$ swaps rows $q$ and $r$, we have $q, r > j \ge s > i$, where $s$ is the row of the nonzero entry of $M$.
Basically, the implicit restriction on $P_j$ forces $\tilde L_i = L_i$, since $P_j$ and $P_j^T$ agree with the identity in rows/columns $1$ through $j$.
FWIW, the fact that this detail goes unmentioned makes the proof fairly poorly constructed, in my opinion, and all of the $\tilde L$ notation makes it difficult to parse. A much better approach would be to demonstrate directly that $L_i$ and $P_j$ commute, as sketched below.
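Here is a sketch of that commutation argument, under the same implicit restriction on $P_j$, writing $L_i = I + M$ where $M = L_i - I$ holds the single multiplier (as in the example below):
$$P_j L_i P_j^T = P_j (I + M) P_j^T = P_j P_j^T + P_j M P_j^T = I + M = L_i,$$
where $P_j M P_j^T = M$ because the rows $q, r$ that $P_j$ permutes are zero rows of $M$ (its only nonzero row is $s \le j$), and the columns it permutes are zero columns of $M$ (its only nonzero column is $i < j$). Multiplying on the right by $P_j$ gives $P_j L_i = L_i P_j$.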
Example. We will take
- $i = 1$ (column number of the nonzero element of $M$)
- $s = 2$ (row number of the nonzero element of $M$)
- $j = 2$ (row reduction step number of the permutation $P_j$)
- $q = 3$ (one of the two rows swapped by $P_j$)
- $r = 4$ (the other row swapped by $P_j$)
A possible choice of $L_1$ and $P_2$ is as follows.
$$L_1 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \hspace{1 in} P_2 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0\end{bmatrix}$$
From these, we can derive $M = L_1 - I$ and $P_2^T$.
$$M = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \hspace{1 in} P_2^T = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0\end{bmatrix}$$
Now, compute $P_2 M P_2^T$ and notice that it equals $M$. Since $L_1 = I + M$ and $P_2 P_2^T = I$, this gives $\tilde L_1 = P_2 L_1 P_2^T = I + P_2 M P_2^T = I + M = L_1$, which is lower triangular.
$$P_2 M P_2^T = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0\end{bmatrix}\begin{bmatrix} 0 & 0 & 0 & 0 \\ 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0\end{bmatrix}$$
$$P_2 M P_2^T = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} = M, \qquad \text{so} \qquad \tilde L_1 = I + M = L_1.$$
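If you want to sanity-check the example numerically, here is a minimal NumPy sketch (the matrices and indices are exactly the ones above; nothing else is assumed):

```python
import numpy as np

# L1 adds 2 * (row 1) to row 2; M = L1 - I holds the single multiplier.
L1 = np.array([[1., 0., 0., 0.],
               [2., 1., 0., 0.],
               [0., 0., 1., 0.],
               [0., 0., 0., 1.]])
M = L1 - np.eye(4)

# P2 swaps rows q = 3 and r = 4 (zero-based indices 2 and 3).
P2 = np.eye(4)[[0, 1, 3, 2]]

# The rows/columns P2 permutes are zero in M, so conjugation fixes M ...
assert np.array_equal(P2 @ M @ P2.T, M)

# ... hence P2 L1 P2^T equals L1, and L1 and P2 commute.
assert np.array_equal(P2 @ L1 @ P2.T, L1)
assert np.array_equal(P2 @ L1, L1 @ P2)
print("verified: P2 M P2^T == M and P2 L1 P2^T == L1")
```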
Best Answer
In LU decomposition you convert $A$ into an upper triangular matrix $U$, and the row operations you perform can be expressed as lower triangular matrices. Let's say, for argument's sake, that no swaps are required, so $L_k L_{k-1} \cdots L_1 A = U$. Now, a product of lower triangular matrices is lower triangular, and the inverse of a lower triangular matrix is also lower triangular, so $A = (L_k \cdots L_1)^{-1} U = LU$. If swaps are required in the decomposition and you know them beforehand, you can apply them all first and then do the decomposition (which now needs no swaps); so if $P$ encodes all the swaps, you have $PA = LU$.
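For concreteness, here is a minimal NumPy sketch of that idea (a hypothetical `plu` helper, not production code: it does no singularity checks). It records the swaps in $P$ as it goes; the swap of already-computed multipliers in $L$ is exactly the $P_j L_i P_j^T$ relabeling discussed above, and at the end $PA = LU$ holds:

```python
import numpy as np

def plu(A):
    """LU with partial pivoting: returns P, L, U with P @ A == L @ U."""
    n = A.shape[0]
    U = A.astype(float).copy()
    L = np.eye(n)
    P = np.eye(n)
    for k in range(n - 1):
        # Pivot: bring the largest entry in column k (on/below row k) to row k.
        p = k + np.argmax(np.abs(U[k:, k]))
        if p != k:
            U[[k, p]] = U[[p, k]]
            P[[k, p]] = P[[p, k]]
            L[[k, p], :k] = L[[p, k], :k]  # permute earlier multipliers too
        # Eliminate below the pivot, storing the multipliers in L.
        for s in range(k + 1, n):
            L[s, k] = U[s, k] / U[k, k]
            U[s] -= L[s, k] * U[k]
    return P, L, U

A = np.array([[0., 2., 1.],
              [1., 1., 0.],
              [2., 1., 1.]])
P, L, U = plu(A)
assert np.allclose(P @ A, L @ U)
```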