Regression – Least Squares Estimate Without Intercept Term

least squares, matrix, regression

How do I calculate the least squares estimate of $\beta$? For example, say I have the (small) dataset:

| Observation | x | y |
| --- | --- | --- |
| 1 | 2 | 5 |
| 2 | 7 | 3 |

How do I compute the least squares estimate $\hat\beta$ under the model $y=\beta x+\varepsilon$, where $\varepsilon$ is the error term?

Moreover, if I am trying to calculate $\hat\beta$ using the formula $\hat\beta=(X^TX)^{-1}X^Ty$, what would $X$ and $y$ be in this case?

Thank you!

Best Answer

Let's work through your example.

$$ L(y,\beta) = \sum_i \big( y_i - \beta x_i \big)^2 = \big( 5-2\beta \big)^2 + \big( 3-7\beta \big)^2 \\ = 25-20\beta+4\beta^2 + 9 - 42\beta+49\beta^2 \\ = 34 - 62\beta + 53\beta^2 $$

The squared-error loss is therefore $L = 34 - 62\beta + 53\beta^2$. Now take the derivative with respect to $\beta$ and set it equal to zero.

$$ \dfrac{dL}{d\beta} = 106\beta - 62 = 0 \implies \hat\beta = \dfrac{62}{106} = \dfrac{31}{53} \approx 0.585 $$

By the geometry (the loss is an upward-opening parabola in $\beta$), we know this is a minimum, though feel free to test the second derivative (a worthwhile exercise if you have not done it).
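In fact, the second-derivative check takes one line:

$$ \dfrac{d^2L}{d\beta^2} = 106 > 0, $$

so the critical point is indeed the minimum.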

In matrix notation, we do the same as we usually would. Since there is no intercept, the usual column of $1$s is gone, so we simply wind up with $X = (2, 7)^T$ and $y = (5, 3)^T$, and the matrix algebra follows.
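Carrying out that algebra for this dataset:

$$ X^TX = 2^2 + 7^2 = 53, \qquad X^Ty = 2\cdot 5 + 7\cdot 3 = 31, \qquad \hat\beta = (X^TX)^{-1}X^Ty = \dfrac{31}{53}, $$

which is the same $62/106$ we found with calculus.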

Since we know that the OLS solution is $(X^TX)^{-1}X^Ty$, we can skip the formal calculus and go right to this. More likely, though, we would do this calculation on a computer. In R, we specify that the intercept should be excluded with a command like `lm(y ~ 0 + x)`. Such a command agrees with the calculus-based solution.

x <- c(2, 7)
y <- c(5, 3)
L <- lm(y ~ 0 + x)  # the 0 (equivalently -1) drops the intercept from the formula
summary(L)          # the coefficient on x is 31/53, about 0.585
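If you want to see the matrix formula itself at work, here is a minimal sketch in base R; the only assumption is that we build $X$ as a one-column matrix, since there is no intercept column:

X <- matrix(c(2, 7), ncol = 1)                # design matrix: one column, no intercept
y <- c(5, 3)
beta_hat <- solve(t(X) %*% X) %*% t(X) %*% y  # (X^T X)^{-1} X^T y
beta_hat                                      # 31/53, about 0.585, matching lm()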