Linear Model – Effect of Standardization on Y-Intercept

multiple regression, standardization

I was experimenting with the independent variables in a linear model and noticed that the y-intercept becomes 0 when all the variables are standardized. Intuitively I understand that I'm shifting the line by subtracting the means of the normally distributed variables, but I was wondering if there is a more rigorous/theoretical way of explaining/deriving this.

Assuming we start with a model of the form

$$y = \text{X}b + e$$

where $y$ is the vector of the dependent variable, $\text X$ is the design matrix and $b$ is the coefficient vector, and suppose we subtract from every variable its sample mean and divide by its sample standard deviation. How can we show rigorously that the intercept term vanishes?
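Concretely, with $x_{ij}$ denoting observation $j$ of predictor $i$, the standardization described above replaces each predictor column (and, when applied, $y$) by

$$\tilde x_{ij} = \frac{x_{ij} - \bar x_i}{s_i}, \qquad \bar x_i = \frac{1}{n}\sum_{j=1}^n x_{ij},$$

where $s_i$ is the sample standard deviation of predictor $i$, so every standardized column has sample mean $0$ and sample standard deviation $1$.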

Best Answer

Your model is:

$$y_j = \text{X}_{j} b + \epsilon_j = \sum_{i=0}^px_{ij}b_i + \epsilon_j$$

Let $b_0$ be the intercept, so that every $x_{0j} = 1$:

$$y_j = b_0 + \sum_{i=1}^px_{ij}b_i + \epsilon_j$$

The sample mean of $y$ is then:

$$\bar y = \frac{1}{n}\sum_{j=1}^n y_j = \frac{1}{n}\sum_{j=1}^n \left( b_0 + \sum_{i=1}^px_{ij}b_i + \epsilon_j\right)\\ =\frac{1}{n} n \cdot b_0 + \sum_{i=1}^p \left(\frac{1}{n}\sum_{j=1}^nx_{ij}\right)b_i + \frac{1}{n}\sum_{j=1}^n\epsilon_j$$

With an intercept in the model, the OLS residuals sum to zero, so the last term vanishes. And since standardization made the average of each predictor column of $\mathbf X$ equal to $0$, we get:

$$\bar y = b_0$$

If you standardize $y$ as well, then $\bar y = 0$, so:

$$\bar y = b_0 = 0$$

QED.
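The same conclusion can be sketched from the OLS normal equations in matrix form: the equation corresponding to the intercept column of ones gives

$$\hat b_0 = \bar y - \sum_{i=1}^p \bar x_i \hat b_i,$$

so whenever every predictor is centered ($\bar x_i = 0$ for all $i$), the fitted intercept equals $\bar y$; centering $y$ as well forces $\hat b_0 = 0$.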

Note that this depends only on the centering, not on the scale of the variables.


This can easily be shown in R. Compare the three fits, and especially the intercept of fit2 with mean(y):

x = iris$Petal.Width
y = iris$Petal.Length
fit1 = lm(y ~ x)               # raw variables: intercept differs from mean(y)
fit2 = lm(y ~ scale(x))        # standardized x: intercept equals mean(y)
mean(y)
coef(fit2)[1]                  # same value as mean(y)
fit3 = lm(scale(y) ~ scale(x)) # both standardized: intercept is 0
coef(fit3)[1]                  # 0 up to floating-point error