Solved – How do we know $X'X$ is nonsingular in OLS

least squares, matrix inverse, regression

I am currently working through the mechanics of OLS estimates and the hat matrix. One thing I have been searching for without luck is how we know that the term $X'X$ is invertible, where $X'$ represents the transpose of $X$.

I understand that $X'X$ is a symmetric matrix, but I also know that being symmetric alone does not guarantee nonsingularity.

For reference I am referring to this equation:

$$ H = X(X'X)^{-1}X' $$
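For context, here is a minimal NumPy sketch of that equation (the random design matrix is made up purely for illustration); it also confirms the standard facts that $H$ is symmetric and idempotent:

```python
import numpy as np

rng = np.random.default_rng(42)
n, p = 50, 3
X = rng.normal(size=(n, p))        # made-up design matrix, n > p

# H = X (X'X)^{-1} X'
H = X @ np.linalg.inv(X.T @ X) @ X.T

print(np.allclose(H, H.T))         # True: H is symmetric
print(np.allclose(H @ H, H))       # True: H is idempotent
```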

Any help with this is greatly appreciated.


Through the helpful answers below and a few other Google searches, I think I have found an answer to my question (at least for most cases).

When performing OLS, we organize our data into a matrix $X$ with $n$ observations (rows) and $p$ parameters (columns). In almost every case $n > p$, so the columns of $X$ can be, and in practice usually are, linearly independent, i.e. $rank(X) = p$. Since $rank(X'X) = rank(X)$, the $p \times p$ matrix $X'X$ also has $p$ linearly independent columns. Because $X'X$ is a square matrix (rows equal columns), it must have rows which are linearly independent as well (i.e. $rank(X'X) = p$, aka "full rank"). A full-rank square matrix is always invertible.
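As a quick numerical sanity check of this reasoning (a minimal NumPy sketch; the random data is made up purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 3                      # n observations, p parameters, n > p

X = rng.normal(size=(n, p))        # columns linearly independent (almost surely)
XtX = X.T @ X                      # p x p matrix

print(np.linalg.matrix_rank(X))    # 3 -> rank(X) = p
print(np.linalg.matrix_rank(XtX))  # 3 -> rank(X'X) = p, full rank
print(np.linalg.inv(XtX).shape)    # (3, 3) -> the inverse exists
```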

Please correct me if I am wrong here, but I think the logic follows.

I used these questions as resources:

https://math.stackexchange.com/questions/2430179/if-x-is-linearly-independent-prove-xtx-is-positive-definite

https://math.stackexchange.com/questions/691812/proof-of-when-is-a-xtx-invertible

https://math.stackexchange.com/questions/214542/linear-independent-sets-of-non-square-matricies

Best Answer

It's a property of the $\text{rank}$ operator when it is used on real matrices $\mathbf{A}$: $$ \text{rank}(\mathbf{A}) = \text{rank}(\mathbf{A}') = \text{rank}(\mathbf{A}'\mathbf{A}) = \text{rank}(\mathbf{A}\mathbf{A}'). $$
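A quick numerical check of this identity (a sketch with a randomly generated real matrix, not part of the original answer):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 4))        # an arbitrary real matrix

ranks = [np.linalg.matrix_rank(M) for M in (A, A.T, A.T @ A, A @ A.T)]
print(ranks)                       # [4, 4, 4, 4] -> all four ranks agree
```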

In your case, the data matrix $\mathbf{X} \in \mathbb{R}^{n \times p}$ is usually tall and skinny ($n > p$), so the rank of everything is the number of linearly independent columns/predictors/covariates/independent variables. If all the columns are linearly independent, then $\text{rank}(\mathbf{X}) = p$, and so $\mathbf{X}'\mathbf{X}$ is invertible. If you have collinearity, that is, columns that can be written as linear combinations of others, then $\text{rank}(\mathbf{X}) < p$, and you cannot find a unique inverse for $\mathbf{X}'\mathbf{X}$ (you can, however, find generalized inverses for it).
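To illustrate the collinear case (a hedged sketch: the exactly duplicated column below is a made-up example of perfect collinearity):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = 2.0 * x1                                # exact linear combination of x1
X = np.column_stack([np.ones(n), x1, x2])    # p = 3 columns, but rank 2

XtX = X.T @ X
print(np.linalg.matrix_rank(XtX))            # 2 < p -> X'X is singular

# np.linalg.inv(XtX) is unreliable here because XtX is singular
# (it may raise LinAlgError); a generalized (Moore-Penrose)
# inverse still exists:
XtX_pinv = np.linalg.pinv(XtX)
print(np.allclose(XtX @ XtX_pinv @ XtX, XtX))  # True: Penrose condition holds
```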
