[Math] Complete Pivoting VS Partial Pivoting in Gauss Elimination

gaussian-elimination, linear-algebra, matrices, numerical-methods

I have a hard time understanding when and under what conditions we should use Gaussian elimination with complete pivoting, when with partial pivoting, and when with no pivoting. (I mean, what is the exact feature of a matrix that will tell us which one to choose?)

Best Answer

Rule of Thumb/TL;DR: When doing calculations using floating point numbers (such as the double, single, and float data types in many common programming languages), use partial pivoting unless you know you're safe without it, and use complete pivoting only when you know that you need it.


Longer Explanation: A square matrix $A$ has an $LU$ factorization (without pivoting) if, and only if, no zero is encountered in the pivot position when computing an $LU$ factorization of $A$. However, when doing computations using floating point numbers, a pivot that is nearly zero can lead to dramatic rounding errors. The simple workaround is to always permute the rows of the matrix so that the entry of largest magnitude in the current column (on or below the diagonal) is chosen as the pivot entry. This ensures that a nearly zero pivot is never chosen when a better one is available. Complete pivoting goes even further by using both row and column permutations to select the largest entry in the entire remaining submatrix as the pivot entry.
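
To make the row swapping concrete, here is a minimal sketch of $LU$ factorization with partial pivoting in Python/NumPy (the function name and structure are my own illustration; in practice you would call a library routine such as scipy.linalg.lu_factor):

```python
import numpy as np

def lu_partial_pivoting(A):
    """LU factorization with partial pivoting: the permuted rows satisfy A[p] = L U.

    Minimal teaching sketch, not a production routine.
    """
    A = np.array(A, dtype=float)
    n = A.shape[0]
    p = np.arange(n)  # row permutation recorded as an index vector
    for k in range(n - 1):
        # Partial pivoting: choose the entry of largest magnitude in column k,
        # on or below the diagonal, and swap its row into the pivot position.
        pivot = k + np.argmax(np.abs(A[k:, k]))
        if A[pivot, k] == 0.0:
            raise ValueError("matrix is singular to working precision")
        A[[k, pivot], :] = A[[pivot, k], :]
        p[[k, pivot]] = p[[pivot, k]]
        # Eliminate below the pivot; the multipliers (stored in place) are
        # guaranteed to be at most 1 in magnitude thanks to the pivoting.
        A[k+1:, k] /= A[k, k]
        A[k+1:, k+1:] -= np.outer(A[k+1:, k], A[k, k+1:])
    L = np.tril(A, -1) + np.eye(n)
    U = np.triu(A)
    return p, L, U

# Without pivoting, the tiny (1,1) entry would be used as a pivot and the
# factorization would lose almost all accuracy; with pivoting it is harmless.
A = np.array([[1e-20, 1.0],
              [1.0,   1.0]])
p, L, U = lu_partial_pivoting(A)
print(np.allclose(A[p], L @ U))  # True
```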

The above paragraph is a loose, intuitive picture of why pivoting is necessary. One can also prove sharp error bounds by carefully tracking the propagation of errors throughout the entire $LU$ factorization calculation. One way of structuring this error bound is a so-called backward error estimate. In a backward error estimate for solving a linear system of equations $Ax = b$, one bounds the perturbation $E$ needed to make the computed solution $\hat{x}$ (produced by Gaussian elimination followed by back substitution) an exact solution of a nearby system of linear equations $(A+E)\hat{x} = b$. A backward error estimate can be very revealing if, for example, the matrix $A$ is determined by measurements of an engineering system with some error tolerance. If the entries of $A$ are only known to $\pm 1\%$ and the backward error is less than $0.1\%$, then the numerical errors made during our computations are smaller than the measurement errors and we've done a good job. TL;DR: the size of $E$ is a reasonable way to measure the error in an $LU$ factorization and the resulting linear solve.
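
As a concrete illustration of measuring backward error, the snippet below computes the standard normwise relative backward error of a computed solution from its residual (the Rigal–Gaches formula; note that, unlike the bound on $E$ above, it allows perturbations to both $A$ and $b$, and the code is my own sketch rather than part of the original answer):

```python
import numpy as np

def normwise_backward_error(A, x_hat, b):
    """Smallest eta such that (A + dA) x_hat = b + db for some perturbations
    with ||dA|| <= eta ||A|| and ||db|| <= eta ||b||, in the infinity norm."""
    r = b - A @ x_hat
    return np.linalg.norm(r, np.inf) / (
        np.linalg.norm(A, np.inf) * np.linalg.norm(x_hat, np.inf)
        + np.linalg.norm(b, np.inf)
    )

# For a well-conditioned system solved by LU with partial pivoting
# (np.linalg.solve), the backward error is near the unit roundoff ~1e-16.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 100))
b = rng.standard_normal(100)
x_hat = np.linalg.solve(A, b)
print(normwise_backward_error(A, x_hat, b))
```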

For Gaussian elimination without pivoting, the backward error can be arbitrarily bad. Fortunately, for partial pivoting, the backward error $E$ can be bounded as $\|E\|_\infty \le 6n^3 \rho \|A\|_\infty u + \mbox{higher order terms in $u$}$ ${}^\dagger$. Here, $\|\cdot\|_\infty$ is the matrix operator $\infty$-norm and $u$ is the unit roundoff, which quantifies the accuracy of floating point computations ($u \approx 10^{-16}$ for IEEE double precision arithmetic). The quantity $\rho$ is known as the growth factor for partial pivoting. While it is possible for $\rho$ to be as large as $2^{n-1}$, in practice $\rho$ usually grows very modestly with $n$.${}^\dagger$ Concerning the fact that $\rho$ grows very modestly in most applications, legendary numerical analyst Kahan wrote "Intolerable pivot-growth [with partial pivoting] is a phenomenon that happens only to numerical analysts who are looking for that phenomenon."${}^\$$

Nonetheless, one can write down matrices for which partial pivoting will fail to give an accurate answer due to an exponential growth factor $\rho = 2^{n-1}$. Wilkinson showed that the growth factor for complete pivoting is much smaller in the worst case: $\rho \le n^{1/2} \left(2\cdot 3^{1/2}\cdot 4^{1/3}\cdots n^{1/(n-1)}\right)^{1/2}$. In practice, $\rho$ for complete pivoting is almost always less than $100$.${}^\dagger$
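
To see the exponential growth with partial pivoting in action, here is a sketch (my own illustration) of the classic worst-case matrix, on which partial pivoting performs no row swaps and the last column doubles at every elimination step:

```python
import numpy as np
from scipy.linalg import lu

def worst_case_matrix(n):
    """1 on the diagonal, -1 below it, 1 in the last column."""
    A = np.eye(n) - np.tril(np.ones((n, n)), -1)
    A[:, -1] = 1.0
    return A

n = 60
A = worst_case_matrix(n)
P, L, U = lu(A)                      # LU with partial pivoting
rho = np.abs(U).max() / np.abs(A).max()
print(rho, 2.0 ** (n - 1))           # growth factor hits 2^(n-1), about 5.8e17
```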

After having delved into the details, you can see that this is a subtle business, and even in the classical field of numerical linear algebra there is somewhat of a gap between theory and experiment. In general, Gaussian elimination with partial pivoting is very reliable. Unless you know you can get away without pivoting (symmetric positive definite and diagonally dominant matrices are notable examples), you should use partial pivoting to get an accurate result. (Or compensate with something clever. For example, SuperLU uses "static pivoting", where one does "best guess" pivoting before starting the $LU$ factorization and then does no pivoting during the factorization. The loss of accuracy of this approach is compensated for by using a few steps of iterative refinement.)
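
As a sketch of one of the "safe without pivoting" cases mentioned above (my own illustration, not part of the original answer): a symmetric positive definite system can be factored and solved by Cholesky, which uses no row or column interchanges and is still backward stable:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

# Build a symmetric positive definite matrix and solve A x = b with no pivoting at all.
rng = np.random.default_rng(1)
B = rng.standard_normal((200, 200))
A = B @ B.T + 200 * np.eye(200)      # symmetric positive definite by construction
b = rng.standard_normal(200)

c, lower = cho_factor(A)             # Cholesky factorization, no pivoting
x = cho_solve((c, lower), b)
print(np.linalg.norm(A @ x - b, np.inf) / np.linalg.norm(b, np.inf))  # tiny residual
```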

If partial pivoting isn't accurate enough, one can move to complete pivoting for its lower growth factor. As user3417 points out, there are ways of solving $Ax = b$ other than $LU$ factorization-based approaches, and these may be faster and more accurate than Gaussian elimination with complete pivoting. For example, $QR$ factorization runs in $O(n^3)$ operations and has no growth factor. There may be special cases when one truly does want to use an $LU$ factorization: for example, a Gaussian elimination-based approach can be used to construct structure-preserving factorizations of a Cauchy-like matrix. In this case, complete pivoting (or its close cousin, rook pivoting) may be the best approach.
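
For completeness, here is a minimal sketch (again my own illustration, not part of the original answer) of solving $Ax = b$ via Householder $QR$ instead of $LU$:

```python
import numpy as np
from scipy.linalg import qr, solve_triangular

# Solve A x = b via A = Q R: no pivoting is required and there is no growth
# factor, at the cost of roughly twice the arithmetic of LU.
rng = np.random.default_rng(2)
A = rng.standard_normal((100, 100))
b = rng.standard_normal(100)

Q, R = qr(A)                          # Householder QR factorization
x = solve_triangular(R, Q.T @ b)      # back substitution with upper triangular R
print(np.linalg.norm(A @ x - b, np.inf))
```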


${}^\dagger$ Golub and Van Loan, Matrix Computations, Fourth Edition, Section 3.4.

${}^\$$ Quoted in Higham, Accuracy and Stability of Numerical Algorithms, Second Edition, Chapter 9.