Solved – Panel linear regression with duplicate time

panel data, plm

I am trying to carry out a panel linear regression with yield and climate data from a project called ISIMIP. In this project, different groups of scientists run their crop models using the same GCMs so that they can compare their results with each other. I am using yield and climate data from 3 crop models that used 3 different GCMs (meaning I have 9 crop model-GCM combinations).

My plan is to run a panel linear regression on a yield-climate panel for each crop model (so each panel consists of climate data from 3 different GCMs and the corresponding yield data from the same crop model). I am using the 'plm' package in R. However, I am running into the following issue:

> panel.plr <- pdata.frame(panel, index = c('GCM','Grid','Year'), row.names = F)
Warning message:
In pdata.frame(panel, index = c("GCM", "Grid", "Year"), row.names = F) :
  duplicate couples (id-time) in resulting pdata.frame
 to find out which, use e.g. table(index(your_pdataframe), useNA = "ifany")

The warning is accurate: I am using 3 different climate series from 3 different GCMs, so each Grid-Year pair appears three times, once per GCM.
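To see which pairs are duplicated without relying on plm, a minimal base-R sketch might look like the following. The toy data frame `panel` and its values are made up for illustration; only the column names GCM, Grid and Year come from the call above.

```r
# Toy stand-in for the real panel: 3 GCMs x 2 grid cells x 2 years (hypothetical values)
panel <- expand.grid(GCM  = c("gcm1", "gcm2", "gcm3"),
                     Grid = c("g1", "g2"),
                     Year = 2001:2002,
                     stringsAsFactors = FALSE)

# Count how often each (Grid, Year) pair occurs; a count above 1 is a duplicate
pair_counts <- as.data.frame(table(Grid = panel$Grid, Year = panel$Year))
dup_pairs   <- pair_counts[pair_counts$Freq > 1, ]
print(dup_pairs)

# Once GCM is part of the key, no row is repeated
n_dup_with_gcm <- sum(duplicated(panel[, c("GCM", "Grid", "Year")]))
print(n_dup_with_gcm)
```

Each Grid-Year pair shows up once per GCM, which matches the duplicates the warning reports.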

Can someone please help me to find a way around this issue? Thank you very much.

Best Answer

Try as.data.frame rather than pdata.frame. I got rid of the error with this simple change. I used as.data.frame for my panel regression (as below). I had quarterly data covering more than 10 years for more than 10 firms.

pdata <- as.data.frame(emp_1990q1_2005q4)
fe2 <- plm(leverage ~ ibb + s, data = pdata, model = "within", effect = "twoways", index = c("id", "date"))
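For the case in the question, where the same Grid-Year pair appears once per GCM, one possible workaround is to treat each GCM-Grid combination as its own individual, so that every (id, Year) pair is unique. This is an assumption on my part and not something tested in the answer above. My understanding of the plm documentation is also that a three-element index is read as individual, time, group, so c('GCM','Grid','Year') would treat Grid as the time index. In the sketch below, the toy data and the column names temp, yield and id are hypothetical, and the plm call only runs if the package is installed.

```r
# Toy data: 3 GCMs x 2 grid cells x 5 years; all values are simulated
set.seed(1)
panel <- expand.grid(GCM  = c("gcm1", "gcm2", "gcm3"),
                     Grid = c("g1", "g2"),
                     Year = 2001:2005,
                     stringsAsFactors = FALSE)
panel$temp  <- rnorm(nrow(panel))
panel$yield <- 2 + 0.5 * panel$temp + rnorm(nrow(panel))

# One individual per GCM-Grid combination (hypothetical column name 'id')
panel$id <- interaction(panel$GCM, panel$Grid, drop = TRUE)

# Each (id, Year) pair should now occur exactly once
stopifnot(!any(duplicated(panel[, c("id", "Year")])))

# Fit a fixed-effects model on the combined id, if plm is available
if (requireNamespace("plm", quietly = TRUE)) {
  fit <- plm::plm(yield ~ temp, data = panel,
                  model = "within", index = c("id", "Year"))
  print(summary(fit))
}
```

Whether pooling the three GCMs into one panel this way is appropriate for the analysis is a separate modelling question; the sketch only shows how to get unique id-time pairs.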