How centering can ease interpretation of the intercept of a linear model


In Statistical Rethinking, Chap. 4, page 99, when discussing the table of estimates of a linear model $\mu_i = \alpha + \beta x_i$, where the objective is to estimate height given weight ($x_i$), it is written that:

Notice that α and β are almost perfectly negatively correlated. Right now, this is harmless. It just means that these two parameters carry the same information: as you change the slope of the line, the best intercept changes to match it. But in more complex models, strong correlations like this can make it difficult to fit the model to the data. So we’ll want to use some golem engineering tricks to avoid it, when possible.

The first trick is centering. Centering is the procedure of subtracting the mean of a variable from each value.

[…]

The estimates for β and σ are unchanged (within rounding error), but the estimate for α (a) is now the same as the average height value in the raw data. […] And the correlations among parameters are now all zero. What has happened here?

The estimate for the intercept, α, still means the same thing it did before: the expected value of the outcome variable, when the predictor variable is equal to zero. But now the mean value of the predictor is also zero. So the intercept also means: the expected value of the outcome, when the predictor is at its average value. This makes interpreting the intercept a lot easier.

I don't understand how this eases the interpretation of the intercept. I can see, however, that removing the correlation between the two parameters makes it easier to fit the model to the data. Is this what the author is saying here?

Best Answer

The reason data is centered for Bayesian regression is that, for some types of sampling (e.g., Gibbs), highly correlated parameters increase the autocorrelation in the chains, which makes sampling less efficient and means longer runs are needed.
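
A rough way to see where the correlation comes from, and why centering removes it. This is only a sketch under assumptions that are not in the book's text: flat priors and Gaussian errors, so that the posterior for $(\alpha, \beta)$ has the same covariance as the least-squares estimates (the book uses weakly informative priors, so its numbers will differ slightly). Writing $\bar{x}$ for the mean of the predictor and $S_{xx} = \sum_i (x_i - \bar{x})^2$:

$$
\operatorname{Var}(\hat\alpha) = \sigma^2 \frac{\sum_i x_i^2}{n\, S_{xx}}, \qquad
\operatorname{Var}(\hat\beta) = \frac{\sigma^2}{S_{xx}}, \qquad
\operatorname{Cov}(\hat\alpha, \hat\beta) = -\frac{\sigma^2\, \bar{x}}{S_{xx}}
$$

$$
\operatorname{Corr}(\hat\alpha, \hat\beta) = -\frac{\bar{x}}{\sqrt{\tfrac{1}{n}\sum_i x_i^2}}
$$

When the mean of $x$ is large relative to its spread (as with adult weights in kg), this ratio is close to $-1$. After centering, $\bar{x} = 0$ and the correlation is exactly zero.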

As far as interpretation is concerned: before the predictor is centered, $\alpha$ is the expected height of a person with weight 0. It is therefore just a numerical intercept (possibly negative) without any real-world interpretation. Centering the weights around their mean makes $\alpha$ the expected height of a person of average weight, so its numeric value now has a real-world interpretation.
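
Both effects can be checked numerically. The following is a minimal sketch, not the book's code: it uses synthetic data (the coefficients 114 and 0.9, the noise level, and the helper name `fit_line` are all made up for illustration) and ordinary least squares as a stand-in for a flat-prior posterior.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption, not the book's dataset):
# weight in kg, height in cm.
n = 200
weight = rng.normal(45.0, 6.5, size=n)
height = 114.0 + 0.9 * weight + rng.normal(0.0, 5.0, size=n)


def fit_line(x, y):
    """Least-squares fit of y = alpha + beta * x.

    Returns (alpha, beta, correlation between the two estimates).
    """
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    # The covariance of (alpha, beta) is proportional to (X^T X)^{-1};
    # the factor sigma^2 cancels when forming the correlation.
    cov = np.linalg.inv(X.T @ X)
    corr = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])
    return coef[0], coef[1], corr


# Fit once with raw weights, once with weights centered on their mean.
alpha_raw, beta_raw, corr_raw = fit_line(weight, height)
alpha_c, beta_c, corr_c = fit_line(weight - weight.mean(), height)

print(f"raw:      alpha={alpha_raw:.2f}  beta={beta_raw:.3f}  corr={corr_raw:.3f}")
print(f"centered: alpha={alpha_c:.2f}  beta={beta_c:.3f}  corr={corr_c:.3f}")
print(f"mean height: {height.mean():.2f}")
```

In this sketch the slope is the same in both fits, the centered intercept coincides with the sample mean of `height`, and the correlation between the two estimates drops from nearly -1 to essentially zero.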
