[Math] Under what conditions does QR decomposition lead to a non-invertible upper triangular matrix

linear-regression, matrix-decomposition

I've been reading up on QR decomposition in the context of linear regression. The predictive equation for linear regression is:

$y_{\text{predict}} = X\beta$

where $X$ is an [N samples, p variables] matrix and $\beta$ is the vector of coefficients. Following the normal equations approach and minimizing the sum of squared residuals, one finds that $\beta$ is:

$\beta = (X^{T} X)^{-1} X^{T} y$
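
For concreteness, here is a minimal numpy sketch of the normal-equations solve; the design matrix and targets are made up purely for illustration:

import numpy as np

# Made-up data: N = 4 samples, p = 2 variables (intercept column plus one regressor)
X = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [1.0, 5.0],
              [1.0, 7.0]])
y = np.array([3.1, 4.9, 9.2, 12.8])

# beta = (X^T X)^{-1} X^T y
beta = np.linalg.inv(X.T @ X) @ X.T @ y
print(beta)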

The problem with this is that sometimes $X^{T} X$ is not invertible, so QR decomposition of $X$ may be a way around it, i.e.

$X = QR$

Substituting $X = QR$ into the normal equations and using $Q^{T}Q = I$, the coefficients $\beta$ become:

$\beta = R^{-1}Q^{T}y$
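
As a sketch of that formula in numpy (reusing the made-up data from the normal-equations example above):

import numpy as np

# Same made-up data as the normal-equations sketch
X = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [1.0, 5.0],
              [1.0, 7.0]])
y = np.array([3.1, 4.9, 9.2, 12.8])

Q, R = np.linalg.qr(X)  # reduced QR: Q is N x p, R is p x p upper triangular
# Solve R @ beta = Q.T @ y; a dedicated triangular solver such as
# scipy.linalg.solve_triangular would exploit R's structure, but
# np.linalg.solve is enough for a sketch.
beta = np.linalg.solve(R, Q.T @ y)
print(beta)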

The advantage seems obvious: it is easier to handle the upper triangular matrix $R$ than to try to invert the possibly singular matrix $X^{T} X$. Regarding upper triangular matrices, Wikipedia says:

Note that a triangular matrix is invertible precisely when its diagonal entries are invertible (non-zero).

QUESTION: Under what conditions is $R$ non-invertible when QR decomposition is done in linear regression?

Best Answer

We know from the properties of determinants that

$\det(X) = \det(Q)\det(R)$

If $X$ is a square matrix (we'll handle non-square matrices later), $Q$ will be an orthogonal matrix. Orthogonal matrices have determinant equal to $1$ or $-1$, which means

$\det(X) = \pm \det(R)$

So $\det(R) = 0$ exactly when $\det(X) = 0$. This means that $R$ is not invertible when $X$ is not full rank. Recall that in a matrix that is not full rank, some of the rows are linearly dependent. Below is an example of such a matrix:

$X = \begin{bmatrix} 6 & 0 & -6 \\ 2 & 4 & 1 \\ 1 & 4 & 2 \end{bmatrix}$

Very clearly, row 1 equals $6 \times (\text{row } 2) - 6 \times (\text{row } 3)$, a linear combination of the other two rows. This means that $X$ is not full rank and its determinant is $0$ (as the numpy session below confirms), so it is singular. Solving this with Python's numpy:

>>> import numpy as np
>>> X = np.asarray([[6, 0, -6], [2,4,1], [1,4,2]])
>>> np.linalg.det(X)
0.0
>>> (Q,R)=np.linalg.qr(X)
>>> R
array([[ -6.40312424e+00,  -1.87408514e+00,   4.99756038e+00],
       [  0.00000000e+00,  -5.33739683e+00,  -4.00304762e+00],
       [  0.00000000e+00,   0.00000000e+00,  -8.88178420e-16]])
>>> np.linalg.inv(R)
array([[ -1.56173762e-01,   5.48362688e-02,  -1.12589991e+15],
       [ -0.00000000e+00,  -1.87357252e-01,   8.44424930e+14],
       [ -0.00000000e+00,  -0.00000000e+00,  -1.12589991e+15]])

Note that $R_{3,3} \approx 0$ (it is likely not exactly $0$ because of numerical round-off in the QR solver). This near-zero diagonal entry leads to exploding values in $R^{-1}$. Even though numpy 'inverted' $R$, one should be cautious about accepting such diverging values.
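
One practical takeaway: instead of inverting $R$ blindly, you can screen its diagonal for near-zero entries first. Below is a minimal sketch; the tolerance is a heuristic loosely modeled on numpy's matrix_rank default, and the exact cutoff is a judgment call:

import numpy as np

X = np.asarray([[6, 0, -6], [2, 4, 1], [1, 4, 2]], dtype=float)
Q, R = np.linalg.qr(X)

# Treat diagonal entries below a scale-aware tolerance as zeros
diag = np.abs(np.diag(R))
tol = max(X.shape) * np.finfo(X.dtype).eps * diag.max()

if np.any(diag < tol):
    print("R is numerically singular: X is rank deficient")
else:
    R_inv = np.linalg.inv(R)  # only invert when the diagonal is clean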

Another case where $R$ is not invertible is when $X$ is an $m \times n$ matrix with $n > m$, i.e. the system is underdetermined. The QR decomposition then yields an $R$ matrix that is $m \times n$, and a non-square matrix has no inverse.
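
A quick numpy check of the shapes in this underdetermined case (the $2 \times 3$ matrix here is arbitrary):

import numpy as np

# Arbitrary wide matrix: m = 2 samples, n = 3 variables
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
Q, R = np.linalg.qr(X)
print(Q.shape, R.shape)  # (2, 2) (2, 3): R is not square
# np.linalg.inv(R) raises LinAlgError here, since a 2 x 3 matrix has no inverse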