[Math] Matrix inversion to solve least squares problem

Tags: inverse, least-squares, linear-algebra

I want to solve a least squares problem of the form $\mathbf{A}\vec{x}=\vec{b}$, where $\mathbf{A}$ is an $m\times n$ matrix associated with the transformation $T:\mathbb{R}^n \to \mathbb{R}^m$, and where $\vec{b} \notin C(\mathbf{A})$, meaning that there is no exact solution for $\vec{x}$.

Even though $\mathbf{A}\vec{x}=\vec{b}$ has no solution, I know I can find the estimate $\hat{\vec{x}}$ that minimizes the error, and that after some algebra (sorry, I have no reference at hand) the condition can be expressed as:

$$\mathbf{A}^{\intercal}\mathbf{A}\hat{\vec{x}} = \mathbf{A}^{\intercal}\vec{b}$$
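
(A quick sketch of the standard derivation, assuming the error is measured by $\|\mathbf{A}\vec{x}-\vec{b}\|^2$: setting the gradient with respect to $\vec{x}$ to zero gives

$$\nabla_{\vec{x}}\,\|\mathbf{A}\vec{x}-\vec{b}\|^2 = 2\,\mathbf{A}^{\intercal}(\mathbf{A}\vec{x}-\vec{b}) = \vec{0} \quad\Longrightarrow\quad \mathbf{A}^{\intercal}\mathbf{A}\,\hat{\vec{x}} = \mathbf{A}^{\intercal}\vec{b}.)$$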
My problem is that all textbooks I have come across stop here. They state that this new formulation of the problem has a solution (that, I understand), but they say nothing about solving for $\hat{\vec{x}}$.
Typically I would go about and solve like this:

$$\hat{\vec{x}} = (\mathbf{A}^{\intercal}\mathbf{A})^{-1}\mathbf{A}^{\intercal}\vec{b}$$

However, this requires that $\mathbf{A}^{\intercal}\mathbf{A}$ be invertible.
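
For concreteness, here is a minimal NumPy sketch of that formula on a small, made-up over-determined system (the numbers are arbitrary), compared against NumPy's built-in least-squares solver:

```python
import numpy as np

# A made-up over-determined system (more rows than columns), purely for illustration.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
b = np.array([6.0, 5.0, 7.0, 10.0])

# Normal-equations estimate: x_hat = (A^T A)^{-1} A^T b.
# Solving the linear system is preferable to forming the inverse explicitly.
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

# For comparison, NumPy's least-squares routine (numerically safer in practice).
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

print(x_hat, x_lstsq)  # the two estimates should agree
```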

Question: is $\mathbf{A}^{\intercal}\mathbf{A}$ always invertible? If so, how can I prove it?


Edit: for the particular case I'm interested in, $\mathbf{A}$ has more rows than columns, as it comes from an over-determined system.

Edit 2: I have also checked the formulation of my problem: the columns of $\mathbf{A}$ are linearly independent, meaning that for my particular case $\mathbf{A}$ has full column rank.

Best Answer

$\mathbf{A}^{\intercal}\mathbf{A}$ is certainly not always invertible. In fact, it always has the same rank as $\mathbf{A}$: if $\mathbf{A}^{\intercal}\mathbf{A}\vec{x}=\vec{0}$, then $\vec{x}^{\intercal}\mathbf{A}^{\intercal}\mathbf{A}\vec{x}=\|\mathbf{A}\vec{x}\|^2=0$, so $\mathbf{A}\vec{x}=\vec{0}$; hence the two matrices have the same null space. Since $\mathbf{A}^{\intercal}\mathbf{A}$ is $n\times n$, it is invertible exactly when $\mathbf{A}$ has rank $n$, i.e. full column rank. In particular, if $\mathbf{A}$ has more columns than rows, then $\mathbf{A}^{\intercal}\mathbf{A}$ is never invertible. However, when doing least squares in practice, $\mathbf{A}$ has many more rows than columns and its columns are linearly independent, so $\mathbf{A}^{\intercal}\mathbf{A}$ has full rank and is invertible in nearly all cases.
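
As a quick numerical sanity check of the rank claim (the matrices below are arbitrary examples, one with independent columns and one without):

```python
import numpy as np

# Full column rank: the columns are linearly independent, so A^T A is invertible.
A_full = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])

# Dependent columns (second column = 2 * first), so B^T B is singular.
B_dep = np.array([[1.0, 2.0],
                  [2.0, 4.0],
                  [3.0, 6.0]])

for M in (A_full, B_dep):
    gram = M.T @ M
    # rank(M) and rank(M^T M) agree; only the full-column-rank case gives an invertible Gram matrix
    print(np.linalg.matrix_rank(M), np.linalg.matrix_rank(gram))
```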
