Stratified or multi-stage sampling

descriptive statistics, experiment-design, sampling, stratification

I just started volunteering with a small organization that wants to estimate the average and total number of environmental trouble spots in a large US city. They have a clear definition of an environmental trouble spot, and planned to survey the neighborhoods on foot to identify the trouble spots. Due to limited resources, they could only survey a fraction of the city.

Before I joined the organization, they carried out their own sampling procedure, which went like this. They divided the city up into 5 large "regions," each of which is made up of many small census blocks. Each region is composed of something like, say, 15-30 census tracts. They wanted to sample from each of the 5 large regions, and they only had resources to sample about 20 census tracts total for the whole city. They calculated the urban population residing in each of the 5 regions, and used population weights to determine the number of census tracts to randomly sample within each region (such that the total for the whole city didn't exceed 20). Their logic was that more environmental hot spots would be in populated areas. Here is an example.

Region 1 – 10% of city population – selected 2 random tracts
Region 2 – 40% of city population – selected 8 random tracts
Region 3 – 30% of city population – selected 6 random tracts
Region 4 – 10% of city population – selected 2 random tracts
Region 5 – 10% of city population – selected 2 random tracts
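
For reference, the allocation in this example works out as a proportional split of the 20 tracts by population share. A minimal sketch of that arithmetic, using the example percentages above (the variable names are only illustrative):

```python
# Example population shares (percent of city population) from the list above.
population_share_pct = {
    "Region 1": 10,
    "Region 2": 40,
    "Region 3": 30,
    "Region 4": 10,
    "Region 5": 10,
}
total_tracts = 20  # tracts the organization could afford to survey

# Proportional allocation: each region gets its population share of the tracts.
# With other shares the rounded values may not sum to total_tracts
# and would need a manual adjustment.
allocation = {
    region: round(total_tracts * pct / 100)
    for region, pct in population_share_pct.items()
}

for region, n_tracts in allocation.items():
    print(region, n_tracts)
```

Because the split follows population rather than the number of tracts in each region, the fraction of tracts sampled can differ from region to region.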

I joined the project after they completed this sampling, and they asked me to estimate the average number of hot spots per census tract, and also to estimate the total number of hot spots in the city (more of interest).

My problem is, do you know the name of this sampling strategy? Is it a form of stratified random sampling? Or multi-stage sampling? If I know the name or classification, I can better figure out what formula to use. Any other advice on analyzing this?

Best Answer

This is a stratified random sample. The strata are the regions, which were fixed in advance rather than sampled. The only sampling is simple random sampling of census tracts within strata. I suggest that you consult a standard text; I recommend Sharon Lohr, Sampling: Design & Analysis, 2009. The statistical packages Stata, SAS, SPSS Complex Samples, and R all have commands for analyzing survey data. Note that you will benefit from incorporating the "finite population" correction to reduce standard errors, since 20 tracts is a sizable fraction of a city with 5 regions of roughly 15-30 tracts each.

Note that if there had been a second stage of sampling, e.g. a systematic sample of areas within each census tract, the design would properly be called a stratified two-stage sample, with stratification at the first stage.
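
As a rough illustration of the standard estimators for the single-stage stratified design described in the question, here is a minimal Python sketch of the estimated total and per-tract mean, with the finite population correction applied within each region. The tract counts per region (`N_h`) and the hot-spot counts are made-up placeholders, not data from the survey, and the function name `stratified_estimates` is my own. In practice a survey package would normally do this.

```python
import math

def stratified_estimates(strata):
    """Stratified estimates from a simple random sample within each stratum.

    strata maps a region name to (N_h, counts), where N_h is the number of
    tracts in the region and counts are hot spots found in the sampled tracts.
    Each stratum needs at least 2 sampled tracts to estimate its variance.
    """
    N = sum(N_h for N_h, _ in strata.values())
    total_hat = 0.0
    var_total = 0.0
    for N_h, counts in strata.values():
        n_h = len(counts)
        ybar_h = sum(counts) / n_h
        # Sample variance within the stratum (n_h - 1 denominator).
        s2_h = sum((y - ybar_h) ** 2 for y in counts) / (n_h - 1)
        # Scale the stratum mean up to all N_h tracts in the region.
        total_hat += N_h * ybar_h
        # Finite population correction: 1 - n_h / N_h.
        fpc = 1 - n_h / N_h
        var_total += N_h ** 2 * fpc * s2_h / n_h
    se_total = math.sqrt(var_total)
    return {
        "total": total_hat,
        "se_total": se_total,
        "mean_per_tract": total_hat / N,
        "se_mean_per_tract": se_total / N,
    }

# HYPOTHETICAL inputs: N_h and the counts below are invented for illustration.
example = {
    "Region 1": (20, [3, 5]),
    "Region 2": (24, [6, 8, 7, 9, 5, 7, 8, 6]),
    "Region 3": (30, [4, 6, 5, 5, 3, 7]),
    "Region 4": (15, [2, 4]),
    "Region 5": (25, [1, 3]),
}

result = stratified_estimates(example)
for name, value in result.items():
    print(name, round(value, 3))
```

Each region's sample mean is scaled up by the number of tracts in that region, so it is the tract count per region, not the population share used for allocation, that determines the weights.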
