[Math] Least squares fit linear algebra


Find the line y = ax+b which provides the best fit (that is, the least-squares
fit) for the data:

(x, y) = (-3, 11), (1, -17), (-2, 3), (4, -1)

SOLUTION: The initial (inconsistent) system of equations is

$\begin{pmatrix} -3 & 1 \\ 1 & 1 \\ -2 & 1 \\ 4 & 1 \end{pmatrix}\begin{pmatrix} a\\b \end{pmatrix} = \begin{pmatrix} 11\\-17\\3\\-1 \end{pmatrix}$

My question is: where did all those 1's come from (the column of 1's)? If they aren't randomly generated numbers, how do you find them? Thank you.

Best Answer

The column of $1$'s is added to the design matrix because you want to fit a straight line with an intercept; the entries are not randomly generated. Each data point $(x_i, y_i)$ contributes one equation $a\,x_i + b\cdot 1 = y_i$, so the coefficient multiplying $b$ in every row is $1$. You should add this column every time you want to fit a line of the form $$ y=ax+b. $$ If you believe that there is no intercept in the true model, i.e. that the true model is of the form $$ y=ax, $$ then you can delete the column of $1$'s and use just the observations of $x$.

If you write $$ Y=\begin{pmatrix} 11\\-17\\3\\-1 \end{pmatrix} \text{ and } X=\begin{pmatrix} -3 & 1 \\ 1 & 1 \\ -2 & 1 \\ 4 & 1 \end{pmatrix}=(X_1,X_2), $$ where $X'$ denotes the transpose of $X$, the regression coefficients form a vector: $$ \begin{pmatrix} \hat a\\\hat b \end{pmatrix} =(X'X)^{-1}X'Y. $$ If you omit the column of $1$'s, you get only a scalar, which estimates the model $y=ax$: $$ \tilde a=(X_1'X_1)^{-1}X_1'Y. $$
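
As a rough numerical check, here is a minimal sketch in Python, assuming NumPy is available. It builds the design matrix from the data in the question, with the column of ones next to the $x$ values, and solves the normal equations $(X'X)\beta = X'Y$. The variable names (`a_hat`, `b_hat`, `a_tilde`, `coef`) are illustrative choices, not part of the original problem.

```python
import numpy as np

# Data points (x, y) from the question.
x = np.array([-3.0, 1.0, -2.0, 4.0])
y = np.array([11.0, -17.0, 3.0, -1.0])

# Design matrix: the first column holds x (it multiplies a),
# the second column is all ones (it multiplies the intercept b).
X = np.column_stack([x, np.ones_like(x)])

# Solve the normal equations (X'X) beta = X'y for beta = (a, b).
a_hat, b_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check with NumPy's built-in least-squares solver.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Model without intercept, y = a x: drop the column of ones,
# which reduces the formula to a ratio of two scalars.
a_tilde = (x @ y) / (x @ x)

print("with intercept:   a =", a_hat, " b =", b_hat)
print("lstsq cross-check:", coef)
print("without intercept: a =", a_tilde)
```

For this data, $X'X = \begin{pmatrix} 30 & 0 \\ 0 & 4 \end{pmatrix}$ and $X'Y = \begin{pmatrix} -60 \\ -4 \end{pmatrix}$, so the fitted line is $y = -2x - 1$. Because the $x$ values happen to sum to zero, the slope of the no-intercept model is also $-2$; in general the two slopes differ.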
