Likelihood Ratio Test – Fixing Cluster Robust Standard Errors with Bootstrap

clustered-standard-errors, hypothesis-testing, inference, likelihood-ratio, maximum-likelihood

There is common agreement that likelihood ratio tests are invalid when Maximum Likelihood Estimates (MLE) are computed with cluster-corrected standard errors. The main argument is that observations are no longer independent; hence, it is not a proper likelihood. The same argument applies when using weights in the estimation. However, I have not been able to find any paper discussing these issues or offering corrected asymptotic distributions.
For instance, the official Stata FAQ offers the following.

The “likelihood” for pweighted or clustered MLEs is not a true likelihood; i.e., it is NOT the distribution of the sample. When there is clustering, individual observations are no longer independent, and the “likelihood” does not reflect this. Where there are pweights, the “likelihood” does not fully account for the “randomness” of the weighted sampling.
The “likelihood” for pweighted or clustered MLEs is used only for the computation of the point estimates and should not be used for variance estimation using standard formulas. Thus the standard likelihood-ratio test should NOT be used after estimating pweighted or clustered MLEs. Instead of likelihood-ratio tests (the lrtest command), Wald tests (the test command) should be used.

They conclude by referring to a possible Bonferroni correction when using Wald tests. However, I am still wondering whether the conventional likelihood ratio test is "salvageable" by, for example, adjusting the asymptotic distribution or using some sort of bootstrap procedure. As far as I understand the problem, the issue is that the asymptotic distribution of the statistic is no longer $\chi^{2}$.
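One pragmatic way to probe this (not an established correction, just a sketch) is a pairs cluster bootstrap of the likelihood-ratio statistic itself: resample whole clusters, recompute the LR statistic with the null recentred at the full-sample estimate, and compare the observed statistic to that bootstrap distribution rather than to $\chi^{2}$. Everything in the sketch below, including the simulated data, the variable names, and the recentring scheme, is an illustrative assumption.

```python
# Hypothetical sketch: pairs cluster bootstrap of the likelihood-ratio statistic.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(42)

# Simulate clustered data (assumption: 40 clusters with random intercepts, true beta_x = 0)
n_clusters, n_per = 40, 25
cluster = np.repeat(np.arange(n_clusters), n_per)
u = rng.normal(0, 1, n_clusters)[cluster]                 # cluster-level effect
x = rng.normal(size=cluster.size)
y = 1.0 + 0.0 * x + u + rng.normal(size=cluster.size)
df = pd.DataFrame({"y": y, "x": x, "cluster": cluster})

def lr_stat(data, beta0=0.0):
    """LR statistic for H0: beta_x = beta0 (restriction imposed by shifting y)."""
    full = sm.OLS(data["y"], sm.add_constant(data[["x"]])).fit()
    restricted = sm.OLS(data["y"] - beta0 * data["x"],
                        np.ones((len(data), 1))).fit()
    return 2 * (full.llf - restricted.llf), full.params["x"]

lr_obs, beta_hat = lr_stat(df, beta0=0.0)

# Cluster bootstrap: resample whole clusters, recentre the null at the full-sample estimate
B, lr_boot = 999, []
groups = {g: d for g, d in df.groupby("cluster")}
for _ in range(B):
    draw = rng.choice(n_clusters, size=n_clusters, replace=True)
    boot = pd.concat([groups[g] for g in draw], ignore_index=True)
    lr_b, _ = lr_stat(boot, beta0=beta_hat)
    lr_boot.append(lr_b)

p_boot = (1 + np.sum(np.array(lr_boot) >= lr_obs)) / (B + 1)
print(f"observed LR = {lr_obs:.2f}, cluster-bootstrap p-value = {p_boot:.3f}")
```

The idea is that the bootstrap distribution of the recentred LR statistic approximates its null distribution under the actual dependence structure, so the p-value does not rely on the $\chi^{2}$ reference; whether this gives accurate rejection rates in a given application would have to be checked by simulation.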

Best Answer

Everything works better when you specify a full model that leads to a full likelihood. If you have exchangeability within clusters, add random effects. If you have serial correlation, use a serial correlation structure, e.g., through a Markov model or generalized least squares. Advantages of full models include the following (a sketch of the random-effects route appears after the list):

  • statistical efficiency
  • better fit
  • more accurate confidence intervals
  • ability to handle wildly varying cluster sizes
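For concreteness, here is a minimal sketch of the random-effects route in Python with statsmodels, assuming a data frame `df` with columns `y`, `x`, and `cluster` (all names are illustrative, e.g. the simulated data above). Because both nested models are fit by full maximum likelihood rather than REML, the likelihood ratio test is again based on a genuine likelihood.

```python
# Hypothetical sketch: random-intercept model fit by ML, followed by a standard LR test.
import statsmodels.formula.api as smf
from scipy import stats

full = smf.mixedlm("y ~ x", df, groups=df["cluster"]).fit(reml=False)
restricted = smf.mixedlm("y ~ 1", df, groups=df["cluster"]).fit(reml=False)

# LR test of the fixed effect of x; valid because the fits use ML, not REML
lr = 2 * (full.llf - restricted.llf)
p = stats.chi2.sf(lr, df=1)
print(f"LR = {lr:.2f}, p = {p:.4f}")
```

Note the `reml=False` argument: comparing REML likelihoods across models with different fixed effects is not meaningful, so the LR test requires ML fits.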

The cluster bootstrap is an approximate method that can yield disappointing confidence interval coverage and will not handle large clusters or varying cluster sizes well.
