Solved – plm in fixed effects model doesn’t work with id and time

multilevel-analysis, panel data, plm, regression

I am currently writing my thesis and I am stuck on a problem. I am trying to figure out how firm-level, country-level and industry-level variables influence corporate social responsibility. I want to add industry (SEC) and time fixed effects, but I can only run the code with one of the two. The code should look like this, where the first group of variables is country-specific and TA and LV are firm-specific:

within <- plm(ESG ~ VOI_AC + Political.Stability + Government.effectiveness + Regulatory.Quality + Rule.of.law + control.of.Corruption + Press.Freedom + pdi + idv + mas + uai + GI + HD + TA + LV, data = Neu1, index = c("SEC", "year"), model = "within")

I tried some troubleshooting, such as:

Neu1$year <- group_indices(Neu1, year, SEC.NAME)

and some variations of this. I know the problem is that I have duplicate observations per industry and year. But that is because several firms belong to the same industry in a given year. I can't get rid of the error:

  duplicate couples (id-time)
In addition: Warning messages:
1: In pdata.frame(data, index) :
  duplicate couples (id-time) in resulting pdata.frame
 to find out which, use e.g. table(index(your_pdataframe), useNA = "ifany")
2: In is.pbalanced.default(index[[1]], index[[2]]) :
  duplicate couples (id-time)
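
The hint in the warning can be tried out on a small example. The following is a minimal sketch in base R, assuming a made-up data frame `toy` with a hypothetical `firm` column (the real `Neu1` is not shown here, so all values are invented). It counts how often each id-time pair occurs, once with the industry as id and once with the firm as id:

```r
# Toy stand-in for Neu1 (hypothetical values): firms f1 and f2 share industry "A"
toy <- data.frame(
  firm = c("f1", "f2", "f3", "f1", "f2", "f3"),
  SEC  = c("A",  "A",  "B",  "A",  "A",  "B"),
  year = c(2015, 2015, 2015, 2016, 2016, 2016)
)

# Count observations per (SEC, year) pair; a count above 1 is a duplicate couple
pair_counts <- table(toy$SEC, toy$year)
print(pair_counts)

# The same count with the firm as id: each (firm, year) pair occurs once
firm_counts <- table(toy$firm, toy$year)
print(firm_counts)

any_dup_sec  <- any(pair_counts > 1)   # industry-year pairs repeat
any_dup_firm <- any(firm_counts > 1)   # firm-year pairs do not
```

In this toy data the industry-year pairs repeat because two firms share an industry, while the firm-year pairs are unique.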

The first part of my data looks like this:
[image: first rows of the data]

The last part of my data looks like this:
[image: last rows of the data]

I am thankful for any help. I am a newcomer to R and I have read a lot about this problem, but I didn't find a workaround.

Best Answer

I would suggest fitting a multilevel model, with firm nested within industry and also nested within country. This is just a special case of a mixed effects model and could be specified with this kind of formula (using the notation adopted by the lme4 package and others):

ESG ~ fixed_effects + (1 | industry) + (1 | industry:firm) + (1 | country) + (1 | country:firm)

which is equivalent to:

ESG ~ fixed_effects + (1 | industry/firm) + (1 | country/firm)
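
The equivalence of the slash shorthand and the expanded form can be checked on simulated data. This is only a sketch under stated assumptions: the data frame `toy`, its sizes and all simulated values are made up, `TA` stands in for a single fixed effect, and the lme4 package is assumed to be installed. The sketch keeps only one firm-level term, because in this toy data every firm belongs to exactly one industry and one country, so `industry:firm` and `country:firm` would define the same groups.

```r
library(lme4)

set.seed(1)

# Hypothetical toy data: 60 firms, each in one industry and one country
firms <- data.frame(
  firm     = factor(1:60),
  industry = factor(rep(paste0("ind", 1:6), each = 10)),
  country  = factor(rep(paste0("cty", 1:8), length.out = 60))
)
# No shared columns, so merge() returns every firm-year combination
toy <- merge(firms, data.frame(year = 2010:2014))

# Simulated predictor and outcome with industry, country and firm intercepts
toy$TA  <- rnorm(nrow(toy))
toy$ESG <- 50 + 2 * toy$TA +
  rnorm(6)[as.integer(toy$industry)] +
  rnorm(8)[as.integer(toy$country)] +
  rnorm(60)[as.integer(toy$firm)] +
  rnorm(nrow(toy))

# Shorthand: firm nested within industry, plus a country intercept
fit_short <- lmer(ESG ~ TA + (1 | industry/firm) + (1 | country), data = toy)

# Expanded form of the same nesting
fit_long <- lmer(ESG ~ TA + (1 | industry) + (1 | industry:firm) + (1 | country),
                 data = toy)

# Both parameterisations should reach the same log-likelihood
ll_short <- as.numeric(logLik(fit_short))
ll_long  <- as.numeric(logLik(fit_long))
print(c(short = ll_short, long = ll_long))
```

The two fits differ only in how the nested term is labelled (`firm:industry` versus `industry:firm`), not in the groups it defines.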

So here we are fitting random effects (random intercepts) for the grouping factors, to handle the non-independence of observations within each grouping factor. In the mixed model framework, anything that is not specified in the random part of the formula is a fixed effect. Note that the effect of a fixed-effect variable can also be allowed to vary across levels of a grouping factor, by specifying it as a random slope. For example, if we wanted to allow the effect of a variable X to vary across countries, we could do so like this:

ESG ~ X + other_fixed_effects + (1 | industry) + (1 | industry:firm) + (X | country) + (1 | country:firm)

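...and this is why it is sometimes better to write the expanded version of the formula: the random slope can then be attached to a single grouping term, whereas (X | country/firm) would add a slope at both the country level and the firm-within-country level.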
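
A random slope of this kind might be fitted as sketched below. The data are simulated and entirely hypothetical (the data frame `toy`, the predictor `X` and all effect sizes are assumptions), lme4 is assumed to be installed, and the `(1 | country:firm)` term is left out because in this toy data it would define the same groups as `industry:firm`.

```r
library(lme4)

set.seed(42)

# Hypothetical toy data: 60 firms, each in one industry and one country
firms <- data.frame(
  firm     = factor(1:60),
  industry = factor(rep(paste0("ind", 1:6), each = 10)),
  country  = factor(rep(paste0("cty", 1:8), length.out = 60))
)
toy <- merge(firms, data.frame(year = 2010:2014))  # all firm-year combinations
toy$X <- rnorm(nrow(toy))                          # varies within country

# Simulate country-specific intercepts and country-specific slopes for X
cty_int   <- rnorm(8, sd = 1)
cty_slope <- rnorm(8, sd = 0.5)
ind_int   <- rnorm(6, sd = 1)
firm_int  <- rnorm(60, sd = 1)
ci <- as.integer(toy$country)
toy$ESG <- 50 + ind_int[as.integer(toy$industry)] +
  firm_int[as.integer(toy$firm)] +
  cty_int[ci] + (2 + cty_slope[ci]) * toy$X +
  rnorm(nrow(toy))

# Random intercepts for industry and firm; random intercept and slope of X by country
fit_slope <- lmer(ESG ~ X + (1 | industry) + (1 | industry:firm) + (X | country),
                  data = toy)

# Estimated variance components, including the slope variance for country
print(VarCorr(fit_slope))

# Per-country deviations from the overall intercept and from the overall slope of X
country_effects <- ranef(fit_slope)$country
print(country_effects)
```

With only eight countries the slope variance is estimated from very little information, so a boundary (singular) fit message on data like this would not be surprising.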

Note that with mixed models methodology it is not necessary to specify the level at which a particular variable varies. Provided that the nesting is specified correctly, the model will automatically handle variables that vary at the country level, the industry level or the firm level. If you happen to specify a variable as a random slope, i.e. to vary across levels of a grouping variable, but it does not vary within the levels of that grouping variable, then the random slope will not be identified and you should get a singular fit or an error.
