[Math] Solution to Ax=b with Least Squares

linear algebra

Let's suppose $A$ is a matrix such that $\ker A=\{0\}$ and $b$ is a vector not in the image of $A$. Since $b\notin\operatorname{im}(A)$, the equation $Ax=b$ has no solution.

This is where least squares comes in. Since $b$ is not in the image, we project $b$ onto the image of $A$ to get a vector $b'$ for which the equation $Ax=b'$ is solvable, say by $\hat{x}$. This $\hat{x}$ is the least-squares solution to $Ax=b$.

My question is: if the columns of $A$ are linearly independent and we project $b$ to get $b'\in\operatorname{im}(A)$, then why can't we just row-reduce $Ax=b'$ to get the solution $\hat{x}$? Instead, we must solve the normal equation $A^TAx=A^Tb.$ I get that $A(A^TA)^{-1}A^T$ is the projection onto $\operatorname{im}(A)$, but I am looking for something more fundamental.
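For concreteness, here is a quick numerical sanity check (a sketch using numpy; the $4\times 3$ matrix and the vector $b$ are made up for illustration). It verifies that projecting $b$ onto $\operatorname{im}(A)$ and then solving the now-consistent system $Ax=b'$ lands on the same $\hat{x}$ as the normal equations:

```python
import numpy as np

# Made-up example: ker A = {0} (independent columns), b not in im(A).
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0, 10.0])  # 1 + 2 + 3 != 10, so b is not in im(A)

# Project b onto im(A): b' = A (A^T A)^{-1} A^T b.
b_proj = A @ np.linalg.solve(A.T @ A, A.T @ b)

# Row-reduce Ax = b': here the first 3 equations already form an
# invertible 3x3 system, so solving them recovers the unique solution.
x_rowreduce = np.linalg.solve(A[:3], b_proj[:3])

# Normal equations: A^T A x = A^T b.
x_normal = np.linalg.solve(A.T @ A, A.T @ b)

print(np.allclose(x_rowreduce, x_normal))  # True: the two routes agree
```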

Best Answer

Remember: since you are estimating a solution using least squares (LSQ), the more data you account for, the better.

Example.

Suppose you have $3$ variables.

Theoretically, you need only $3$ equations, i.e. a $3\times 3$ matrix.

However, if you have one more equation from the collected data, i.e. a $4\times 3$ matrix, then theoretically the last equation should be a linear combination of the first $3$. In practice (quite obviously) things will differ: measurement noise makes the fourth equation slightly inconsistent with the others.

So now, since the columns are independent, if you go by row reduction you will take into account only the first $3$ equations; the data from the $4$th equation is ignored.

However, if you multiply by $A^T$, you get the square matrix $A^TA$, and it is invertible (why? because the columns of $A$ are linearly independent). But now you have accounted for the entire data set!
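To make this concrete, here is a minimal numerical sketch (assuming numpy; the $4\times 3$ system and the noisy right-hand side are invented for illustration). The row-reduction route solves only the first $3$ equations, the normal equations use all $4$, and we compare the residual $\|Ax-b\|$ over the full data:

```python
import numpy as np

# Hypothetical noisy 4x3 system: 4 measurements of 3 unknowns.
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
b = np.array([1.1, 1.9, 3.0, 6.3])  # "measured" data with noise

# Row-reduction route: use only the first 3 equations (ignores the 4th).
x_rr = np.linalg.solve(A[:3], b[:3])

# Normal-equations route: A^T A is 3x3 and invertible because the
# columns of A are linearly independent; all 4 equations contribute.
x_ls = np.linalg.solve(A.T @ A, A.T @ b)

# Compare total residuals ||Ax - b|| over the full data set.
print(np.linalg.norm(A @ x_rr - b))  # larger residual (0.3 here)
print(np.linalg.norm(A @ x_ls - b))  # least squares minimizes this (0.15 here)
```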

So the latter (by the Remember above) will give a better approximation than row reduction.