Solved – Does the number of observations in each cluster matter for cluster-robust standard errors?

cluster-sample, clustered-standard-errors, robust

In multiple regression with panel data, when calculating cluster robust standard errors, does it matter if there are only a handful of observations in each cluster?

I am examining whether the relationship between Y and X changes after a policy change. I have about 400 firms, each with 3 years of financial data before the policy change and 3 years after. I am running the following regression:

Y = Intercept + X + Post * X + controls;

"Post" is a dummy taking the value 1 for a year after the policy change and 0 for a year before it. The coefficient on the term "Post * X" would indicate a change in the relation after the new policy is imposed.

To correct for serial correlation within each firm, I use cluster-robust standard errors where each firm is a cluster. However, since I only have 6 observations in each cluster (3 before and 3 after the policy change), do I still need to use cluster-robust standard errors? Is clustering a valid concern in this case?…

Thanks!

Best Answer

Yes, you still should use clustering, even if you only have six observations per firm. The purpose of clustering is to account for the fact that if your observations are correlated, then you don't actually have as much information as the simple $n$ for your sample size would indicate. In the extreme case, where you had perfect correlation of the errors and $x$ values within each cluster (i.e. you just photocopied your data six times for each firm and pasted it all together), your actual effective sample size would be 1/6 as large as the number of observations that you technically count. To fail to use clustered errors here would be to mislead your audience about the amount of underlying data you are drawing from.
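The photocopy thought experiment can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses simulated data (not the asker's), a single regressor with an intercept, and a cluster-robust formula without any finite-sample correction. The function name `slope_standard_errors` and all parameter values are made up for this example.

```python
import math
import random


def slope_standard_errors(x, y, cluster):
    """OLS of y on an intercept and x.

    Returns (slope, conventional SE, cluster-robust SE) for the slope.
    The cluster-robust SE here has no finite-sample correction (an assumption
    made to keep the sketch short).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    resid = [(yi - y_bar) - slope * d for d, yi in zip(dx, y)]

    # Conventional SE: treats all n residuals as independent.
    conventional = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)

    # Cluster-robust SE: add up (x - x_bar) * residual within each cluster
    # first, then square, so within-cluster correlation is not ignored.
    scores = {}
    for d, e, g in zip(dx, resid, cluster):
        scores[g] = scores.get(g, 0.0) + d * e
    clustered = math.sqrt(sum(s * s for s in scores.values())) / sxx
    return slope, conventional, clustered


# Hypothetical data: 400 firms, one observation each.
random.seed(0)
n_firms = 400
x = [random.gauss(0, 1) for _ in range(n_firms)]
y = [1.0 + 0.5 * xi + random.gauss(0, 1) for xi in x]
firm = list(range(n_firms))

# Original data: every firm is its own cluster.
slope_1, conv_1, clus_1 = slope_standard_errors(x, y, firm)

# "Photocopied" data: each firm's observation pasted six times.
slope_6, conv_6, clus_6 = slope_standard_errors(x * 6, y * 6, firm * 6)

print("conventional SE, original vs copied:", conv_1, conv_6)
print("clustered SE,    original vs copied:", clus_1, clus_6)
```

In this setup the conventional standard error shrinks by roughly a factor of $\sqrt{6}$ when the data are copied, even though no information was added, while the clustered standard error is unchanged. In practice one would normally rely on a library routine rather than hand-rolled code, for example the `cov_type="cluster"` option of `fit` in statsmodels.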

In fact, the asymptotic theory developed for clustered standard errors assumes a large number of clusters with relatively small numbers of observations per cluster. Thus, in many respects, your case seems pretty ideally suited to clustered errors.
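For reference, the usual cluster-robust variance estimator can be written as follows. The notation is introduced here for illustration and is not from the original post: $G$ is the number of clusters (firms), $X_g$ holds the rows of the regressor matrix $X$ belonging to firm $g$, and $\hat{u}_g$ is the vector of that firm's OLS residuals. Software typically also applies a finite-sample correction, which is omitted here.

$$
\hat{V}_{\text{cluster}} = (X'X)^{-1} \left( \sum_{g=1}^{G} X_g' \hat{u}_g \hat{u}_g' X_g \right) (X'X)^{-1}
$$

The middle term is a sum over clusters, so its precision is driven by $G$ rather than by the number of observations within each cluster. With roughly $G = 400$ firms and 6 observations per firm, the setting in the question matches the "many clusters, few observations per cluster" case that the theory has in mind.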