If there's a given population X, and I choose to send out surveys to Y people, but only Z people respond, would the sample size be Y or Z? Survey Monkey states that the sample size is only the number of responses received from the survey (respondents), while other websites state that it's the number sent out, and that the number returned does not change the sample size.
Solved – Does sample size include respondents
sample-size
Related Solutions
The short answer is yes: Survey Monkey ignores exactly how you obtained your sample. Survey Monkey has no way of knowing whether what you have gathered is a convenience sample, but virtually every Survey Monkey survey is one. This creates a large discrepancy in what you are actually estimating, and no amount of additional sampling can or will eliminate it. On one hand, there is the population (and the associations within it) that you would estimate from a simple random sample (SRS). On the other, there is the "population" defined by your non-random sampling; you can estimate the associations within it, and the usual power rules hold for those quantities. It's up to you as a researcher to discuss the discrepancy and let the reader decide how well the non-random sample approximates the real trend.
As a side point, the term bias is used inconsistently. In probability theory, the bias of an estimator is defined as $\mathrm{Bias}_n = \mathrm{E}[\hat{\theta}_n] - \theta$. An estimator can be biased but consistent, so that the bias "vanishes" in large samples, i.e. $\hat{\theta}_n \rightarrow_p \theta$; the maximum likelihood estimate of the standard deviation of normally distributed random variables is an example. Estimators whose bias does not vanish (i.e. $\hat{\theta}_n \not\to_p \theta$) are called inconsistent in probability theory. Study design experts (such as epidemiologists) have picked up the habit of calling inconsistency "bias"; in this case it is selection bias or volunteer bias. It is certainly a form of bias, but inconsistency means that no amount of additional sampling will ever correct the problem.
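As a rough illustration of biased-but-consistent (a minimal simulation sketch, assuming a true standard deviation of 1; the function name is mine), the R code below estimates the bias of the maximum likelihood estimate of a normal standard deviation by Monte Carlo and shows it shrinking toward zero as the sample size grows:

# Minimal sketch: the ML estimate of a normal SD (denominator n, not n - 1)
# is biased downward but consistent -- the bias shrinks as n grows.
set.seed(1)
bias_at_n <- function(n, reps = 5000, true_sd = 1) {
  est <- replicate(reps, {
    x <- rnorm(n, mean = 0, sd = true_sd)
    sqrt(mean((x - mean(x))^2))   # MLE of sigma
  })
  mean(est) - true_sd             # Monte Carlo estimate of the bias
}
sapply(c(10, 50, 200, 1000), bias_at_n)   # bias approaches 0 as n increases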
In order to estimate population-level associations from convenience sample data, you would have to correctly identify the sampling probability mechanism and use inverse probability weighting in all of your estimates. This makes sense only in very rare situations, because identifying such a mechanism is next to impossible in practice. One case where it can be done is a cohort of individuals with previously recorded information who are approached to fill out a survey. The nonresponse probability can then be estimated as a function of that previous information, e.g. age, sex, SES, ... Weighting gives you a chance to extrapolate what the results would have been in the non-responder population. The census is a good example of the use of inverse probability weighting for such analyses.
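As a hedged sketch of what that weighting might look like (the variable names, covariates, and response model here are hypothetical, not taken from any particular study): with baseline age and sex recorded for a whole cohort, a logistic regression of response on those covariates gives estimated response probabilities, and the responders are weighted by their inverse:

set.seed(2)
n    <- 1000
age  <- rnorm(n, 45, 12)
male <- rbinom(n, 1, 0.5)
y    <- 2 + 0.05 * age + rnorm(n)                          # outcome of interest
responded <- rbinom(n, 1, plogis(-1 + 0.04 * age)) == 1    # older people respond more

fit <- glm(responded ~ age + male, family = binomial)      # model of the response mechanism
w   <- 1 / fitted(fit)[responded]                          # inverse probability weights

mean(y[responded])              # naive respondent mean (tilted toward older people)
weighted.mean(y[responded], w)  # IPW estimate, closer to the full-cohort mean
mean(y)                         # full-cohort mean (unobservable in a real survey)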
How would you have analyzed the data if the 2 sample sizes had worked out to be the same? A paired test usually requires you to know the pairs, i.e. some form of ID that links the 2 surveys of the same person. A truly anonymous survey will not have this information available, making a paired test impossible. Some surveys include an arbitrary ID number that the respondent enters both times and that is (hopefully) unique to each respondent, but this has to be designed into the survey up front (and may reduce it from anonymous to confidential).
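For what it's worth, here is a small hypothetical sketch of that design in R (the data and ID codes are made up purely for illustration): if both surveys collect the same respondent-chosen code, the two data sets can be merged on it and a paired test becomes possible.

pre  <- data.frame(id = c("A1", "B2", "C3", "D4"), score = c(10, 12, 9, 15))
post <- data.frame(id = c("A1", "C3", "D4"), score = c(13, 11, 16))
both <- merge(pre, post, by = "id", suffixes = c("_pre", "_post"))  # only linked pairs remain
t.test(both$score_post, both$score_pre, paired = TRUE)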
Also, are the 304 out of the 320 a random/representative sample, or could there be a bias? Are the 156 a random/representative sample of the 304, or could there be a bias? If those students who improved were more likely to answer the post survey than those who declined, then that could greatly bias the results.
Are you planning on using a finite population correction?
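(For reference, the usual finite population correction to the standard error of a mean is $\sqrt{(N-n)/(N-1)}$, where $N$ is the population size and $n$ the sample size. If the 320 students really are the whole population of interest, then with $n = 304$ the factor is $\sqrt{16/319} \approx 0.22$, so whether or not you use it makes a large difference.)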
These questions should be examined before the ones you asked, as they will probably have a much larger impact on your results than the bias from using an independent t-test. It may be that your best approach is to report summary statistics and not attempt any formal inference.
Edit
Here is some R code that simulates some data based on the original numbers and compares results:
library(MASS)

simfun <- function(r = 0, d = 0) {
  # Simulate 320 students with correlated pre/post scores (correlation r,
  # true mean difference d), then drop 16 pre and 164 post responses at
  # random to mimic the observed response rates (304 and 156 respondents).
  x <- mvrnorm(320, c(0, d), matrix(c(1, r, r, 1), 2))
  x[sample(320, 16), 1] <- NA
  x[sample(320, 164), 2] <- NA
  # Three analyses: paired t-test and independent t-test on the complete
  # cases only, and an independent t-test using all available responses.
  c(paired = t.test(na.omit(x)[, 1], na.omit(x)[, 2], paired = TRUE)$p.value,
    ind1   = t.test(na.omit(x)[, 1], na.omit(x)[, 2])$p.value,
    ind2   = t.test(na.omit(x[, 1]), na.omit(x[, 2]))$p.value)
}
# Null case: no correlation and no true difference
out <- replicate(10000, simfun(r = 0, d = 0))
out <- t(out)
pairs(out)                    # scatterplot matrix of the three p-values

# How often is each test's p-value larger than another's?
mean(out[, 2] > out[, 1])
mean(out[, 3] > out[, 1])
mean(out[, 3] > out[, 2])

# Observed type I error rate of each test at alpha = 0.05
mean(out[, 1] <= 0.05)
mean(out[, 2] <= 0.05)
mean(out[, 3] <= 0.05)
# Correlated case with a true difference: r = 0.7, d = 0.2
out <- replicate(10000, simfun(r = 0.7, d = 0.2))
out <- t(out)
pairs(out)

mean(out[, 2] > out[, 1])
mean(out[, 3] > out[, 1])
mean(out[, 3] > out[, 2])

# Power of each test at alpha = 0.05
mean(out[, 1] <= 0.05)
mean(out[, 2] <= 0.05)
mean(out[, 3] <= 0.05)
Running this code (and you can change to different values of r and d) shows that when there is no correlation and no difference, all 3 tests give the correct type I error rate. With correlation but no difference, the proper paired test still gives the correct type I error rate, and the other 2 give an error rate below what is specified (conservative). When there is a difference, the paired test has the most power.
So if you are happy with all the assumptions about representative samples and independence between responses and the likelihood of responding, then you could use an independent t-test (even though you don't have independence) and just realize that the results will be conservative on average: p-values too large, confidence intervals too wide. If the test is significant you can be confident in a significant difference. The problem comes with p-values that are a little larger than $\alpha$; they could represent a significant difference with an inflated p-value.
Best Answer
Both numbers can be described as a sample size, because both the group of people you sent the survey to, and the subset of that group who responded to the survey, can be regarded as samples. It's common for research projects to have a hierarchy of samples, and hence a hierarchy of sample sizes, in this fashion. If you're wondering which number to report in the write-up of your study, the answer is that in general, you should report all of them.