Regression – Understanding Linear Probability Model and Effect of Dummy Variables on Standard Errors

binary data, generalized linear model, regression

I am fitting the linear probability model,
$$
Y_i=\beta_0 + \sum_{j=1}^J \beta_j G_{ji} +\varepsilon_i
$$
where $Y_i \in \{0,1\}$ and $G_{ji} \in \{0,1\}$ for $j=1,\ldots,J$, with $\sum_{j=1}^J G_{ji}=1$, i.e., the $G_{ji}$ are group indicators and each individual belongs to exactly one group.

I use OLS to estimate the $\beta_j$, $j=1,\ldots,J$ (leaving out one group to avoid multicollinearity). The odd thing is that the standard errors are the same for all $\beta_j$, $j=1,\ldots,J$, i.e. $se(\widehat \beta_1)=se(\widehat \beta_{2})=\ldots= se(\widehat \beta_J)$.

Is it normal that the standard errors are all the same? This is not the first time it has happened to me, and I am not sure whether I am doing something wrong.

Best Answer

Recall that the standard errors are the square roots of the diagonal elements of the matrix $$ \hat\sigma^2(X'X)^{-1}. $$ As pointed out by @repmat, this result requires that each group is of equal size, i.e., that $$\sum_iG_{ji}=c$$ for all $j=1,\ldots,J$.

In that case, you can easily check that
$$
X'X=n \begin{pmatrix}
1&1/J&\cdots&\cdots&\cdots&1/J\\
1/J&1/J&0&\cdots&\cdots&0\\
1/J&0&1/J&\ddots&\cdots&0\\
\vdots&0&0&\ddots&\ddots&0\\
\vdots&\vdots&\cdots&\ddots&\ddots&0\\
1/J&0&\cdots&\cdots&0&1/J\\
\end{pmatrix},
$$
assuming that the first column contains the constant term. Assuming that the redundant dummy has been dropped, so that $X'X$ is of dimension $J\times J$, the inverse is
$$
(X'X)^{-1}=\frac{1}{n}\begin{pmatrix}
J&-J&\cdots&&\cdots&-J\\
-J&2J&J&\cdots&\cdots&J\\
\vdots&J&2J&J&&J\\
&\vdots&J&2J&\ddots&\vdots\\
\vdots&\vdots&&\ddots&2J&J\\
-J&J&\cdots&\cdots&J&2J
\end{pmatrix}.
$$
We see that the diagonal elements are identical (except for the one on the constant), yielding identical standard errors.

The nature of the $y_i$ is irrelevant here: they enter the standard errors only through $\hat\sigma^2$, which simply multiplies every element of $(X'X)^{-1}$ and therefore does not affect the result that the diagonal elements belonging to the dummies are identical.

This result also shows that the squared standard error on the constant is one half that of the dummies.
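A minimal numerical check of these claims (the choice of $J=4$ groups of 50 observations each is just for illustration):

J <- 4                               # number of groups (illustrative choice)
m <- 50                              # observations per group (equal sizes)
g <- factor(rep(1:J, each = m))
X <- model.matrix(~ g)               # constant plus J - 1 dummies
diag(solve(crossprod(X))) * (J * m)  # gives J for the constant, 2J for every dummy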

P.S.: As suggested by Alecos in the comments, we may write the matrix in block form,
$$
X'X= n\begin{pmatrix} A&B\\C&D \end{pmatrix},
$$
with $A=1$, $B=(1/J,\ldots, 1/J)$, $C=B'$ and $D=I/J$, and use the result on partitioned inverses that the lower-right block of the inverse is
$$
D^{-1}+D^{-1}C(A-BD^{-1}C)^{-1}BD^{-1}
$$
to see that the result is as presented above.
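A quick sketch of this partitioned-inverse route, reusing the toy design matrix X from the chunk above:

M <- crossprod(X) / nrow(X)          # X'X = n * M
A <- M[1, 1, drop = FALSE]
B <- M[1, -1, drop = FALSE]
C <- M[-1, 1, drop = FALSE]
D <- M[-1, -1]
Dinv <- solve(D)
lower_right <- Dinv + Dinv %*% C %*% solve(A - B %*% Dinv %*% C) %*% B %*% Dinv
all.equal(lower_right, solve(M)[-1, -1], check.attributes = FALSE)   # TRUE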

UPDATE: Regarding the discussion in the comments of @repmat's answer, the numerical equivalence does not exactly hold for robust standard errors.

This is because the "meat" matrix $X'\Sigma_uX$ of the robust variance estimator
$$
(X'X)^{-1}X'\Sigma_uX(X'X)^{-1},
$$
where $\Sigma_u=\operatorname{diag}(\hat u_1^2,\ldots,\hat u_n^2)$, has the diagonal
$$
\begin{pmatrix}
\sum_{i=1}^n\hat{u}_i^2\\
\sum_{i:\,G_{1i}=1}\hat{u}_i^2\\
\vdots\\
\sum_{i:\,G_{J-1,i}=1}\hat{u}_i^2
\end{pmatrix}
$$
(assuming that group $J$ has been dropped to avoid multicollinearity), and there is in general no reason for the sums of squared residuals belonging to the different groups to be identical.

The differences are minor, though (at least under homoskedasticity). Here is a modification of his illustration:

set.seed(42)
n <- 999
library(sandwich)
library(lmtest)
library(plyr)

# Build three equal-sized groups of n/3 observations each
year1 <- data.frame(rep(1, n/3))
year2 <- data.frame(rep(2, n/3))
year3 <- data.frame(rep(3, n/3))

years <- rbind.fill(year1, year2, year3)  # stack; non-matching columns become NA
years[is.na(years)] <- 0
years <- as.factor(rowSums(years))        # group factor with levels 1, 2, 3

# Binary outcome, unrelated to the groups
y <- round(runif(n), 0)
reg <- lm(y ~ years)

# Heteroskedasticity-robust (HC0) standard errors
coeftest(reg, vcov = vcovHC(reg, "HC0"))


t test of coefficients:

             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.486486   0.027390 17.7616   <2e-16 ***
years2      -0.027027   0.038678 -0.6988   0.4849    
years3      -0.012012   0.038717 -0.3103   0.7564    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
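For comparison, the conventional (non-robust) standard errors from the same fit are identical for the two dummies, and the group-wise sums of squared residuals that drive the robust "meat" matrix can be inspected directly (the exact numbers depend on the simulated draw, so they are not reproduced here):

coeftest(reg)                         # conventional SEs: identical for years2 and years3
tapply(residuals(reg)^2, years, sum)  # group-wise SSR: close, but not exactly equal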