Solved – Survey regression in R with singleton PSUs

r, regression, survey

I am completely new to R; I downloaded and installed it only today. I am familiar with SAS and Stata. I am using R because I have learned that, for survey regression analysis, R can handle data in which a stratum contains only one PSU. However, I cannot figure out how to write the code at all.

Here is what I have done so far: I read a Stata dataset and saved it as an .RData file. I have also installed the MASS, pscl, and survey (for svyglm) packages.

Here's what I need to do:
1) I am using survey data, so I have a "weight" variable, a "strata" variable, and a "PSU" variable. I need to incorporate those; I know how to do this with svyset in Stata, but have no idea how to do it in R.
2) I have strata with singleton PSUs. I believe I need to use an option called survey.lonely.psu, and I have no idea where to even begin with that. This is the reason I am using R instead of Stata: I do not want to collapse strata or delete observations.
3) The types of regression models I have to run are: survey negative binomial, survey zero-inflated negative binomial (I also need to determine the predictors of zeros), survey logistic, and survey OLS regression.
4) I also can't work out how to write the model itself in R code. In Stata, I can simply write the model as:

svy: nbreg dependent_var independent_var1 independent_var2 independent_var3

I can't figure out how to do that at all in R.

Any and all help will be greatly appreciated.

Best Answer

You need to install the survey package. Here is an example of how to define the survey design you have specified and how to run a linear regression on these data. I assume that the dataset has already been loaded.

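library(survey)
# Strata with a single PSU: center at the overall mean instead of stopping with an error
options(survey.lonely.psu = "adjust")
# Declare the design: PSU, stratum, and sampling-weight variables
design1 <- svydesign(id = ~psuid, strata = ~stratvar, weights = ~weightvar, data = mydata)
# svyglm() defaults to a Gaussian family, i.e. a linear regression
model1 <- svyglm(y ~ x1 + x2, design = design1)
summary(model1)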
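To see what the survey.lonely.psu option actually changes, here is a minimal sketch on made-up data. The data frame toy and the helper se_under() are hypothetical names used only for illustration; the variable names psuid, stratvar, and weightvar simply mirror the example above. Stratum "C" is built to contain a single PSU, and the helper reports the standard error of a weighted mean under a given setting, or NA if the computation stops with an error.

```r
library(survey)

# Hypothetical toy data: stratum "C" contains only one PSU (a singleton).
set.seed(1)
toy <- data.frame(
  stratvar  = c("A", "A", "A", "A", "B", "B", "B", "B", "C", "C"),
  psuid     = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
  weightvar = rep(10, 10),
  x1        = rnorm(10)
)
toy$y <- 2 + 0.5 * toy$x1 + rnorm(10)

# Hypothetical helper: SE of the weighted mean of y under one lonely-PSU
# setting; returns NA if design creation or variance estimation errors out.
se_under <- function(setting) {
  old <- options(survey.lonely.psu = setting)
  on.exit(options(old))  # restore the previous setting afterwards
  tryCatch({
    des <- svydesign(id = ~psuid, strata = ~stratvar,
                     weights = ~weightvar, data = toy)
    as.numeric(SE(svymean(~y, des)))
  }, error = function(e) NA_real_)
}

se_fail      <- se_under("fail")       # the package default
se_certainty <- se_under("certainty")  # singleton contributes no variance
se_adjust    <- se_under("adjust")     # singleton centered at the overall mean

print(c(fail = se_fail, certainty = se_certainty, adjust = se_adjust))
```

The "adjust" setting is the conservative choice among these, since the singleton stratum still contributes to the variance estimate rather than being treated as if it were sampled with certainty.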

IMHO, Thomas Lumley's homepage is an excellent starting point for this kind of thing.
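For the other model types in the question, the same formula syntax carries over, and the family argument of svyglm() selects the model. Below is a minimal sketch on simulated data; mydata and the outcome names y_cont, y_bin, and y_count are made up for illustration. Note that the quasi-Poisson fit is only an overdispersed-count stand-in, not a negative binomial model: svyglm() does not fit negative binomial or zero-inflated models through its family argument, so those need other tools and are not attempted here.

```r
library(survey)
options(survey.lonely.psu = "adjust")

# Hypothetical simulated data: strata A-C have three PSUs each,
# stratum D has a single PSU (psuid 10).
set.seed(42)
n <- 100
mydata <- data.frame(
  stratvar  = rep(c("A", "B", "C", "D"), times = c(30, 30, 30, 10)),
  psuid     = rep(1:10, each = 10),
  weightvar = runif(n, 5, 15),
  x1        = rnorm(n),
  x2        = rbinom(n, 1, 0.5)
)
mydata$y_cont  <- 1 + 0.5 * mydata$x1 - 0.3 * mydata$x2 + rnorm(n)
mydata$y_bin   <- rbinom(n, 1, plogis(-0.5 + 0.8 * mydata$x1))
mydata$y_count <- rpois(n, exp(0.2 + 0.4 * mydata$x1))

design1 <- svydesign(id = ~psuid, strata = ~stratvar,
                     weights = ~weightvar, data = mydata)

# Linear regression (default Gaussian family)
ols_fit <- svyglm(y_cont ~ x1 + x2, design = design1)

# Logistic regression; quasibinomial() avoids spurious warnings about
# non-integer counts that arise from the sampling weights
logit_fit <- svyglm(y_bin ~ x1 + x2, design = design1,
                    family = quasibinomial())

# Overdispersed count model (quasi-Poisson, NOT negative binomial)
count_fit <- svyglm(y_count ~ x1 + x2, design = design1,
                    family = quasipoisson())

# Coefficients with design-based standard errors
print(cbind(estimate = coef(logit_fit),
            se = sqrt(diag(vcov(logit_fit)))))
```

The Stata line svy: nbreg dependent_var independent_var1 independent_var2 therefore maps onto an R formula of the form dependent_var ~ independent_var1 + independent_var2, with the design object taking the role of svyset.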

Rather than only installing the survey package, you can install the Official Statistics task view:

install.packages("ctv")
library(ctv)  # install.views() lives in ctv, so the package must be loaded first
install.views("OfficialStatistics")

This task view gives you a rather nice and complete toolbox to work with survey data.

Note that Stata's svyset command gives you essentially the same possibilities for handling singleton sampling units as R does, through its singleunit() option.
