Linear Model Intercept – Why Does Removing It Not Change Prediction With Factor Predictors?


In a linear model that predicts the birth rate (TFR) per country from per capita GDP, the country is encoded with treatment coding, and there are several measurements (from different years) per country. I therefore expected the first level to act as the "reference intercept", so that only the predictions for this first level would change when the intercept is removed from the model.

However, the predictions do not change for any country when I remove the intercept:

> fit1 <- lm(TFR ~ logGDPpc + logGDPpc2 + 
                    country, data=x)
> fit2 <- lm(TFR ~ logGDPpc + logGDPpc2 + 
                    country - 1, data=x)
> max(abs(fit1$fitted.values - 
          fit2$fitted.values))
[1] 1.847411e-13

The relative differences between the two sets of fitted values are just as negligible:

> max(abs((fit1$fitted.values - 
   fit2$fitted.values)/fit2$fitted.values))
[1] 7.482906e-14

Is this the expected behavior? Why?

Best Answer

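As @Russ Lenth points out, the two fits are equivalent parametrizations of the same model. When the intercept is removed, R adds a dummy variable for the reference level, and its coefficient takes over the role of the intercept.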
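To make the equivalence concrete, here is a short sketch of the two parametrizations. The notation is my own and not from the question: $x_i$ stands for the numeric predictors, $c_i \in \{1, \dots, K\}$ for the country of observation $i$, and $\mathbb{1}[\cdot]$ for an indicator.

```latex
\begin{align*}
\text{with intercept:}    \quad \hat{y}_i &= \beta_0 + \beta_1 x_i + \sum_{k=2}^{K} \gamma_k \, \mathbb{1}[c_i = k] \\
\text{without intercept:} \quad \hat{y}_i &= \beta_1 x_i + \sum_{k=1}^{K} \alpha_k \, \mathbb{1}[c_i = k] \\
\text{mapping:}           \quad \alpha_1 &= \beta_0, \qquad \alpha_k = \beta_0 + \gamma_k \quad (k = 2, \dots, K)
\end{align*}
```

Every set of coefficients in one form corresponds to exactly one set in the other, so least squares finds the same fitted values either way. Only the meaning of the country coefficients changes: differences from the reference level in the first form, per-country intercepts in the second.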

Usually (in R) we specify models with a formula such as y ~ x1 + x2. It's very convenient. Under the hood, R uses the formula and the data to come up with the design matrix.

It's often helpful to look at the design matrix to see how R processed the inputs, especially if the formula includes categorical variables, polynomials or other variable transformations. Use the model.matrix function to construct the design matrix explicitly.

sample_size <- 10
n_levels <- 3

x_cat <- factor(sample(1:n_levels, sample_size, replace = TRUE))
x_num <- rnorm(sample_size)
# We don't need a response to construct the design matrix.

# The two formulas below produce equivalent design matrices, so they
# describe the same model and the fitted values are the same,
# up to some negligible numerical differences.

# The design matrix with an intercept doesn't have a dummy variable for the reference level.
model.matrix(~ x_num + x_cat)
#>    (Intercept)      x_num x_cat2 x_cat3
#> 1            1 -0.9633999      0      1
#> 2            1  0.9592475      0      0
#> 3            1 -0.9279922      1      0
#> 4            1 -0.2097351      1      0
#> 5            1 -0.5812370      1      0
#> 6            1  0.6245961      0      1
#> 7            1 -0.9484379      0      1
#> 8            1 -0.8772716      0      1
#> 9            1  0.8568915      0      1
#> 10           1  1.6237805      0      0
#> attr(,"assign")
#> [1] 0 1 2 2
#> attr(,"contrasts")
#> attr(,"contrasts")$x_cat
#> [1] "contr.treatment"

# The design matrix without an intercept has a dummy variable for the reference level.
model.matrix(~ x_num + x_cat - 1)
#>         x_num x_cat1 x_cat2 x_cat3
#> 1  -0.9633999      0      0      1
#> 2   0.9592475      1      0      0
#> 3  -0.9279922      0      1      0
#> 4  -0.2097351      0      1      0
#> 5  -0.5812370      0      1      0
#> 6   0.6245961      0      0      1
#> 7  -0.9484379      0      0      1
#> 8  -0.8772716      0      0      1
#> 9   0.8568915      0      0      1
#> 10  1.6237805      1      0      0
#> attr(,"assign")
#> [1] 1 2 2 2
#> attr(,"contrasts")
#> attr(,"contrasts")$x_cat
#> [1] "contr.treatment"
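
As a rough check of the points above, the following sketch fits both formulas to a simulated response and compares them. The response y, the seed and the true coefficients are made up purely for illustration; only lm, model.matrix, coef, fitted and qr from base R are used.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

n <- 30
x_cat <- factor(rep(1:3, length.out = n))  # every level is guaranteed to occur
x_num <- rnorm(n)
# Hypothetical response, simulated only for this illustration.
y <- 1 + 0.5 * x_num + c(0, 2, -1)[x_cat] + rnorm(n, sd = 0.1)

fit_int   <- lm(y ~ x_num + x_cat)      # intercept + dummies for levels 2 and 3
fit_noint <- lm(y ~ x_num + x_cat - 1)  # no intercept, one dummy per level

# The fitted values should agree up to floating-point error.
print(all.equal(fitted(fit_int), fitted(fit_noint)))

# The per-level coefficients of the no-intercept fit should equal the
# intercept plus the treatment contrasts of the intercept fit.
b <- coef(fit_int)
a <- coef(fit_noint)
level_coefs_from_int <- unname(b["(Intercept)"] + c(0, b[c("x_cat2", "x_cat3")]))
print(all.equal(level_coefs_from_int, unname(a[c("x_cat1", "x_cat2", "x_cat3")])))

# Both design matrices span the same column space: stacking them side by
# side does not increase the rank beyond that of either matrix alone.
X_int   <- model.matrix(~ x_num + x_cat)
X_noint <- model.matrix(~ x_num + x_cat - 1)
print(qr(cbind(X_int, X_noint))$rank == ncol(X_int))
```

The coefficient tables differ (fit_int reports differences from the reference level, fit_noint reports one intercept per level), but the predictions do not. Note that R-squared is computed differently by summary() for a model without an intercept, so that statistic is not comparable between the two fits even though the fitted values are.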