I've been reading up on QR decomposition in the context of linear regression. The predictive equation for linear regression is:
$y_{predict} = X\beta$
where $X$ is an [N samples, p variables] matrix and $\beta$ is the vector of coefficients. Following the normal-equations approach and minimizing the sum of squared residuals, one finds that $\beta$ is:
$\beta = (X^{T} X)^{-1} X^{T} y$
The problem with this is that sometimes $X^{T} X$ is not invertible. QR decomposition of $X$ may be a way around this, i.e.
$X = QR$
Using QR decomposition, the coefficients $\beta$ are :
$\beta = R^{-1}Q^{T}y$
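As a minimal sketch of this in numpy (the synthetic data and variable names here are my own illustration, not from the question), one solves the triangular system $R\beta = Q^{T}y$ rather than explicitly forming $R^{-1}$:

```python
import numpy as np

# Hypothetical synthetic regression problem: N = 100 samples, p = 3 variables
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
beta_true = np.array([1.5, -2.0, 0.5])
y = X @ beta_true + 0.01 * rng.normal(size=100)

# Reduced QR: Q is 100x3 with orthonormal columns, R is 3x3 upper triangular
Q, R = np.linalg.qr(X)
beta = np.linalg.solve(R, Q.T @ y)   # solve R @ beta = Q^T y

# Agrees with the normal-equations solution when X^T X is invertible
beta_ne = np.linalg.solve(X.T @ X, X.T @ y)
```

Solving the triangular system directly is both cheaper and numerically safer than computing $R^{-1}$ and multiplying.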
The advantage seems obvious: it is easier to handle the upper triangular matrix $R$ than to invert the possibly singular matrix $X^{T} X$. Regarding upper triangular matrices, Wikipedia says:
> Note that a triangular matrix is invertible precisely when its diagonal entries are invertible (non-zero).
QUESTION : Under what conditions is $R$ non-invertible when QR decomposition is done in linear regression?
Best Answer
We know from the properties of determinants that
$\det(X) = \det(Q)\det(R)$
If $X$ is a square matrix (we'll handle non-square matrices later), $Q$ will be an orthogonal matrix. Orthogonal matrices have determinant $1$ or $-1$, which means
$\det(X) = \pm \det(R)$
So $\det(R) = 0$ exactly when $\det(X) = 0$. This means that $R$ is not invertible when $X$ is not full rank. Recall that in a non-full-rank matrix, some of the rows are linearly dependent. Below is an example of such a matrix:
$X = \begin{bmatrix} 6 & 0 & -6 \\ 2 & 4 & 1 \\ 1 & 4 & 2 \end{bmatrix}$
Very clearly, row 1 equals $6 \times (\text{row 2}) - 6 \times (\text{row 3})$. This means that $X$ is not full rank, its determinant is $0$, and it is thus singular. Solving this with Python's numpy:
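A minimal sketch of that computation (the calls used are standard numpy; the prints are just for inspection):

```python
import numpy as np

# The singular matrix from above: row 1 = 6*(row 2) - 6*(row 3)
X = np.array([[6.0, 0.0, -6.0],
              [2.0, 4.0, 1.0],
              [1.0, 4.0, 2.0]])

Q, R = np.linalg.qr(X)
print(R)                  # the bottom-right entry R[2, 2] (i.e. R_{3,3}) is ~0
print(np.linalg.det(X))   # ~0: X is singular
```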
Looking at the value $R_{3,3} \approx 0$ (likely not exactly $0$ because of numerical round-off in the QR solver), inverting $R$ leads to exploding values in $R^{-1}$. Even though numpy will happily 'invert' $R$, one should be cautious about accepting such diverging values.
Another case where $R$ is not invertible is when $X$ is an $m \times n$ matrix with $n > m$, i.e. the system is underdetermined. The (reduced) QR decomposition then yields an $R$ that is $m \times n$, which, being non-square, has no inverse.
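This shape argument is easy to check in numpy (the specific matrix below is just a made-up wide example):

```python
import numpy as np

# Underdetermined case: m = 3 samples, n = 5 variables
X = np.arange(15, dtype=float).reshape(3, 5)

Q, R = np.linalg.qr(X)    # reduced QR of a wide matrix
print(Q.shape, R.shape)   # Q is 3x3, R is 3x5: R is not square, so R^{-1} does not exist
```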