I'm wondering whether, in the same analysis, I can use country both as a fixed effect and as the cluster variable for robust standard errors.
Background: I'm running a multivariable logistic regression with the glm() function in R.
My outcome variable is intimate partner violence, and my predictors include income inequality, age, education, children under 18 living at home, alcohol abuse, and violent behavior towards others.
My supervisor said I should use cluster-robust standard errors in my analysis. I have 28 countries in my sample, so country is my cluster variable. Right now, I'm not very interested in what the differences between countries are.
My idea was to include country as a set of dummy variables to account for country effects in the analysis. But now that I will use country as the cluster for the robust standard errors, I'm no longer sure whether I need to include the country effects as well.
Can I use country as both fixed effects and as the cluster for my robust standard errors?
Can someone please help me? Any answer or link to relevant material would be great.
Best Answer
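In the case of linear models, heteroscedasticity does not affect the point estimates, only the standard errors, so you can use clustered standard errors. Whether you should do this or not is discussed, for example, here. This, however, does not carry over to nonlinear models such as logistic regression, where the same kind of misspecification can also make the coefficient estimates themselves inconsistent.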
Jeff Wooldridge writes for example here:
So you should think carefully about the (nonlinear) model you are estimating. One pragmatic option could be to use fixed-effects OLS for binary data (a linear probability model), as is often done in econometrics and suggested, for example, by Joshua Angrist and Jörn-Steffen Pischke:
PS: As commented by Frank Harrell, this approach is controversial, since by definition you are using a model that is not suited to the (binary) data at hand (see, for example, the blog post by Dave Giles here).
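For illustration, here is a minimal sketch of the fixed-effects OLS option in R, with country entered as dummy variables and the standard errors clustered on country. It assumes the sandwich and lmtest packages are installed; the data are simulated, and the variable names (ipv, age, alcohol_abuse, country) are hypothetical stand-ins for the ones in the question. It only shows that the two uses of country can be combined mechanically; it does not settle whether this is the right model for your data.

```r
# Sketch only: simulated data and hypothetical variable names.
library(sandwich)  # vcovCL() computes cluster-robust covariance matrices
library(lmtest)    # coeftest() reports coefficients with a supplied covariance

set.seed(1)
n_countries   <- 28
n_per_country <- 200
n             <- n_countries * n_per_country

# Country identifier and a country-specific shift in the outcome
country        <- factor(rep(seq_len(n_countries), each = n_per_country))
country_effect <- rnorm(n_countries, sd = 0.5)[as.integer(country)]

# Individual-level predictors and a binary outcome
age           <- rnorm(n, mean = 40, sd = 10)
alcohol_abuse <- rbinom(n, 1, 0.2)
p             <- plogis(-2 + 0.01 * age + 0.8 * alcohol_abuse + country_effect)
ipv           <- rbinom(n, 1, p)
dat           <- data.frame(ipv, age, alcohol_abuse, country)

# Linear probability model: OLS on the 0/1 outcome with country dummies
lpm <- lm(ipv ~ age + alcohol_abuse + country, data = dat)

# Covariance matrix clustered on country; the point estimates are unchanged
vc_cluster <- vcovCL(lpm, cluster = ~ country)
res        <- coeftest(lpm, vcov. = vc_cluster)

# Show only the predictors of interest, not the 27 country dummies
print(res[c("age", "alcohol_abuse"), , drop = FALSE])
```

The coefficients in res are the same as in summary(lpm); only the standard errors, test statistics, and p-values change.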