How do we find the slopes/beta coefficients in a multiple linear regression

linear regression, matrices, matrix equations, regression, statistics

Today I started to learn multiple linear regression, and after reading some articles and watching some videos about it, I came across the equation $$\hat{y} = \hat\beta_0+\hat\beta_1x_1+\cdots+\hat\beta_nx_n$$

which is quite similar to simple linear regression, except that there are $n$ independent variables to deal with instead of just one.

After some time I encountered the equation for finding all the beta coefficients, which is $\hat\beta = (X^TX)^{-1}(X^TY)$ (or maybe this is not it; I'm sorry, I just started studying simple linear regression a day ago),

so I decided to try it in Python. Below are the sample independent variables $x_1, x_2$ and the dependent variable $y$ that I have:

$y = (5,6)$, $x_1 = (4, 5)$, $x_2 = (2, 3)$

Then I built the $X$ matrix, which is just $x_1$ and $x_2$ with a 1 added in front of each; in my understanding this is needed to account for the $\beta_0$ coefficient:

$X=\begin{pmatrix}1&x_{11}&x_{12}\\1&x_{21}&x_{22}\end{pmatrix}=\begin{pmatrix} 1&4&5\\1&2&3\end{pmatrix} $

but when I got to the part of the equation for $\hat\beta$ where we need to find $(X^TX)^{-1}$, it turns out that $X^TX$ is not invertible. How do we actually compute this, or is this not the proper way to get $\hat\beta$?
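Here is a minimal numpy sketch of my attempt (the code below is a reconstruction; the variable names are my own):

```python
import numpy as np

# Sample data: two observations
y = np.array([5.0, 6.0])

# My X matrix, with x1 and x2 as rows and a 1 in front of each
X = np.array([[1, 4, 5],
              [1, 2, 3]], dtype=float)

XtX = X.T @ X  # a 3x3 matrix

# This line fails with numpy.linalg.LinAlgError: Singular matrix
beta_hat = np.linalg.inv(XtX) @ (X.T @ y)
```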

Best Answer

You've got a couple of issues here. First, assuming you're using $x_1$ and $x_2$ to represent different explanatory variables, they should be columns of $X$, not rows (the same way you've got a column of 1's for the constant). So, you should have

$$ X=\begin{pmatrix}1&x_{11}&x_{21}\\1&x_{12}&x_{22}\end{pmatrix}=\begin{pmatrix} 1&4&2\\1&5&3\end{pmatrix} .$$

Next, you say that $x_1$ and $x_2$ are independent. The trouble is, as columns of $X$ they're not linearly independent: the way you've constructed them, $x_{2i} = x_{1i} - 2$, so the third column of $X$ is the second column minus twice the column of 1's. Whenever the columns of $X$ are linearly dependent, $X^TX$ is singular, so you need to make up example vectors that (together with the constant column) are linearly independent.
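You can verify this dependence directly; a quick numpy check:

```python
import numpy as np

X = np.array([[1, 4, 2],
              [1, 5, 3]], dtype=float)

# The third column equals the second column minus 2 times the column of 1's,
# so the columns of X are linearly dependent.
print(np.linalg.matrix_rank(X))        # 2
print(np.linalg.matrix_rank(X.T @ X))  # 2, but X^T X is 3x3, hence singular
```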

I'd recommend having more observations than variables. In fact, with only $n = 2$ observations the three columns of $X$ are vectors in $\mathbb{R}^2$, so they can never be linearly independent, and $X^TX$ (a $3\times 3$ matrix of rank at most 2) is singular no matter what values you pick. Having many more observations than explanatory variables is also typical in real data analysis, and it makes it much harder to end up with linearly dependent explanatory variables when you're making them up. Make $y$, $x_1$, and $x_2$ each 10-dimensional vectors ($n = 10$ observations) and give it a try; a sketch follows below.
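Here's a minimal sketch of that setup (the data-generating values below are made up purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# n = 10 observations with made-up explanatory variables
n = 10
x1 = rng.uniform(0, 10, size=n)
x2 = rng.uniform(0, 10, size=n)

# A made-up "true" relationship plus a little noise, just to have a y to fit
y = 2.0 + 1.5 * x1 - 0.5 * x2 + rng.normal(0, 0.1, size=n)

# Design matrix: a column of 1's, then x1 and x2 as columns
X = np.column_stack([np.ones(n), x1, x2])

# Normal equations: solve (X^T X) beta = X^T y.
# Solving the system is numerically preferable to forming the inverse.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # should be close to [2.0, 1.5, -0.5]

# Cross-check with numpy's built-in least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_lstsq)
```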
