Both of them should return the same results; the discrepancy you are seeing comes from how the matrix inverse is computed.
Set $A=X^T WX$.
When computing the WLS estimates $\hat{\beta}_{WLS}=(X^T WX)^{-1}X^T Wy$, the first factor of the expression is the inverse of $A$. However, as in your code above, literally raising $A$ to the power $(-1)$, i.e. writing A^(-1), simply computes $\frac{1}{A_{ij}}$ for every element $A_{ij}$ of the matrix; it does not give the matrix inverse.
Instead, to find the inverse of the matrix in R, simply type solve(A).
With this change, the manually computed weighted regression and R's built-in weighted regression will give matching results.
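For concreteness, here is a minimal R sketch with simulated data (the data and variable names are my own, just for illustration) contrasting the element-wise A^(-1) with solve(A) and comparing against lm() with weights:

```r
set.seed(1)
n <- 50
x <- rnorm(n)
w <- runif(n, 0.5, 2)                         # positive weights
y <- 1 + 2 * x + rnorm(n) / sqrt(w)

X <- cbind(1, x)                              # design matrix with a constant
W <- diag(w)                                  # diagonal weight matrix
A <- t(X) %*% W %*% X

beta_wrong  <- A^(-1)   %*% t(X) %*% W %*% y  # element-wise reciprocal: not the inverse
beta_manual <- solve(A) %*% t(X) %*% W %*% y  # proper matrix inverse

fit <- lm(y ~ x, weights = w)
cbind(beta_manual, coef(fit))                 # these two columns agree; beta_wrong does not
```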
Hope this helps.
As you suspect, the second version is indeed a special case of the more general first result. We obtain it when $X_2=x_2$ and $X_1=(\iota\;\;x_1)$ with $\iota$ a vector of ones for the constant.
What (maybe) confuses you is that Wooldridge's statement only focuses on the coefficient on $x_1$ and does not bother to discuss $\tilde{b}_0$, the coefficient on the constant, as it is often of secondary interest.
With a constant, $x_1$ and $x_2$, the short regression of $y$ on a constant and $x_1$ gives a $(2\times1)$ coefficient vector $\tilde{b}=(\tilde{b}_0,\tilde{b}_1)'$. Likewise, the regression of $x_2$ on an intercept and $x_1$ yields a coefficient vector, call it $\Delta=(\delta_0,\delta)'$.
In Goldberger's general result, $\Delta$ corresponds to $(X_1'X_1)^{-1} X_1'X_2$, the OLS estimates from a regression of $X_2$ on $X_1$. (When $X_2$ contains $k_2>1$ variables, we would actually obtain a $(k_1\times k_2)$ matrix of estimated coefficients here, with $k_1$ the number of variables in $X_1$.)
Finally, let $\hat{b}_{[0,1]}=(\hat{b}_0,\hat{b}_1)'$ collect the coefficients on the constant and on $x_1$ from the long regression of $y$ on a constant, $x_1$ and $x_2$.
So all in all, we may write
$$
\tilde{b}=\hat{b}_{[0,1]}+\Delta\cdot\hat{b}_2,
$$
which is now, I hope, clearly a special case of Goldberger's formulation. Wooldridge just picks the second element of that vector.
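If it helps, here is a quick numerical check of that identity in R (simulated data and variable names are my own):

```r
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)
y  <- 1 + 2 * x1 + 3 * x2 + rnorm(n)

short <- lm(y ~ x1)        # gives tilde_b = (tilde_b0, tilde_b1)'
long  <- lm(y ~ x1 + x2)   # gives hat_b0, hat_b1, hat_b2
aux   <- lm(x2 ~ x1)       # gives Delta = (delta0, delta)'

b_tilde <- coef(short)
b_01    <- coef(long)[c("(Intercept)", "x1")]
b_2     <- coef(long)["x2"]
Delta   <- coef(aux)

cbind(b_tilde, b_01 + Delta * b_2)   # the two columns coincide (up to rounding)
```

The second element of each column is exactly Wooldridge's statement; the first element is the relation for the intercept that he does not spell out.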
You can obtain the solution for the intercept by setting the partial derivative of the squared loss with respect to the intercept $\beta_0$ to zero. Let $\beta_0 \in \mathbb{R}$ denote the intercept, $\beta \in \mathbb{R}^d$ the coefficients of the features, and $x_i \in \mathbb{R}^d$ the feature vector of the $i$-th sample. We have to solve
\begin{align} \arg\min_{\beta_0} \quad& \mathcal{L}(\beta_0, \beta) \\ \mathcal{L}(\beta_0, \beta) &= \frac{1}{2} \sum_{i=1}^n (y_i - \beta_0 - x_i^\top \beta)^2 \\ \frac{\partial}{\partial \beta_0} \mathcal{L}(\beta_0, \beta) &= -\sum_{i=1}^n (y_i - \beta_0 - x_i^\top \beta) = 0 \end{align}
All we have to do is to solve for $\beta_0$:
\begin{align} \sum_{i=1}^n \beta_0 &= \sum_{i=1}^n (y_i - x_i^\top \beta) \\ \beta_0 &= \frac{1}{n} \sum_{i=1}^n (y_i - x_i^\top \beta) \end{align}
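As a quick sanity check (a small simulated R example of my own), the intercept reported by lm indeed equals the average of $y_i - x_i^\top \hat{\beta}$:

```r
set.seed(7)
n <- 100
X <- matrix(rnorm(n * 3), n, 3)
y <- 1.5 + X %*% c(1, -2, 0.5) + rnorm(n)

fit  <- lm(y ~ X)
beta <- coef(fit)[-1]          # fitted slope coefficients
b0   <- coef(fit)[1]           # fitted intercept

c(b0, mean(y - X %*% beta))    # the two values match
```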
Usually, we assume that all features are centered, i.e., $$\frac{1}{n} \sum_{i=1}^n x_{ij} = 0 \qquad \forall j \in \{1,\ldots,d\},$$ which simplifies the solution for $\beta_0$ to be the average response: \begin{align} \beta_0 &= \frac{1}{n} \sum_{i=1}^n y_i - \frac{1}{n} \sum_{i=1}^n \sum_{j=1}^d x_{ij} \beta_j \\ &= \frac{1}{n} \sum_{i=1}^n y_i - \sum_{j=1}^d \beta_j \frac{1}{n} \sum_{i=1}^n x_{ij} \\ &= \frac{1}{n} \sum_{i=1}^n y_i \end{align}
If, in addition, we also assume that the response $y$ is centered, i.e., $\frac{1}{n} \sum_{i=1}^n y_i = 0$, the intercept is zero and thus drops out.
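Both special cases are easy to verify numerically; here is a small self-contained R sketch (again with simulated data of my own):

```r
set.seed(7)
n <- 100
X <- matrix(rnorm(n * 3), n, 3)
y <- 1.5 + X %*% c(1, -2, 0.5) + rnorm(n)

Xc <- scale(X, center = TRUE, scale = FALSE)   # centered features
c(coef(lm(y ~ Xc))[1], mean(y))                # intercept equals the average response

yc <- y - mean(y)                              # additionally center the response
coef(lm(yc ~ Xc))[1]                           # intercept is (numerically) zero
```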