Solved – Inference to the population when the survey response rate is only 30%

inference, missing data, non-response, sampling, survey

I have conducted a survey in which the questionnaires were sent out to 450 individuals, but only 30% of them answered the questionnaires.

  • Is it still valid to apply the usual inferential analysis (i.e., the inference developed under the assumption of random sampling)?
  • Is it correct to do logistic regression analysis with these data?
  • If not, how should I proceed?

Best Answer

Whether it is "correct" to do logistic regression with these data depends on the type of non-response you have. Usually, one distinguishes three types of non-response mechanism.

  1. Missing completely at random: The non-response depends neither on the variable of interest nor on the covariates.

  2. Missing at random, given some covariates: The non-response depends on some observed covariates but, given those covariates, not on the variable of interest. Some people call this "ignorable non-response".

  3. Not missing at random: The non-response depends on the variable of interest, and cannot be completely explained by the observed covariates.
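To make the three mechanisms concrete, here is a small simulation sketch. Everything in it is made up for illustration: a binary covariate x, a binary outcome y with P(y = 1) = 0.2 + 0.4x, and response probabilities chosen so that the overall response rate is about 30% in each case. The function name simulate and all the numbers are assumptions, not something from the question.

```python
import random


def simulate(mechanism, n=100_000, seed=0):
    """Simulate one non-response mechanism on a made-up population.

    Returns the true mean of y, the mean of y among respondents, and the
    mean of y among respondents with x == 1.
    """
    rng = random.Random(seed)
    all_y, resp_y, resp_y_x1 = [], [], []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        y = 1 if rng.random() < 0.2 + 0.4 * x else 0
        if mechanism == "MCAR":    # response ignores both x and y
            p_respond = 0.30
        elif mechanism == "MAR":   # response depends on x only
            p_respond = 0.45 if x else 0.15
        elif mechanism == "MNAR":  # response depends on y itself
            p_respond = 0.525 if y else 0.15
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        all_y.append(y)
        if rng.random() < p_respond:
            resp_y.append(y)
            if x:
                resp_y_x1.append(y)
    return {
        "true_mean": sum(all_y) / len(all_y),
        "respondent_mean": sum(resp_y) / len(resp_y),
        "respondent_mean_x1": sum(resp_y_x1) / len(resp_y_x1),
    }


for mech in ("MCAR", "MAR", "MNAR"):
    print(mech, simulate(mech))
```

In this setup the true mean of y is 0.4 and the true mean within x = 1 is 0.6. The respondent mean should land near 0.4 under MCAR, near 0.5 under MAR and near 0.7 under MNAR. Within the class x = 1 the respondent mean is close to 0.6 under MAR, so conditioning on x removes the bias, but it is near 0.84 under MNAR, so it does not.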

If you are in the first situation, non-response does not bias the results. It will merely reduce the precision of your estimates, because you have fewer observations than planned.
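As a rough illustration of that loss of precision, the sketch below compares the standard error of an estimated proportion with all 450 questionnaires returned against the 135 actually returned. It assumes simple random sampling, ignores any finite population correction, and uses p = 0.5 as an assumed worst-case proportion; none of these come from the question.

```python
import math


def se_proportion(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)


n_sent = 450
n_resp = round(0.30 * n_sent)  # 135 returned questionnaires
p = 0.5                        # assumed proportion (worst case for the variance)

se_full = se_proportion(p, n_sent)  # if everyone had answered
se_resp = se_proportion(p, n_resp)  # with the 30% who answered
inflation = se_resp / se_full       # equals sqrt(450 / 135), whatever p is

print(f"SE with n={n_sent}: {se_full:.4f}")
print(f"SE with n={n_resp}: {se_resp:.4f}")
print(f"inflation factor: {inflation:.2f}")
```

Under these assumptions the standard error grows by a factor of about 1.83, so confidence intervals get wider but remain centred on the right value.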

If you are in the second situation, you should be able to model the non-response with the covariates it depends on and adjust the data accordingly.
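One simple way to make such an adjustment is a weighting-class adjustment: each respondent is weighted by the inverse of the response rate in his or her class. The sketch below is only an illustration. It assumes that a class variable (here a hypothetical age group) is known for all 450 sampled individuals, for instance from the sampling frame, and every count in it is made up. The function name weighting_class_mean is likewise hypothetical.

```python
from collections import Counter


def weighting_class_mean(sampled_classes, respondents):
    """Estimate the mean of y, weighting respondents by inverse response rate.

    sampled_classes: class label of every sampled unit, respondent or not.
    respondents: list of (class_label, y) pairs for the units that answered.
    """
    n_sampled = Counter(sampled_classes)
    n_responded = Counter(label for label, _ in respondents)
    # Weight = 1 / (estimated response rate in the class).
    weights = {c: n_sampled[c] / n_responded[c] for c in n_responded}
    numerator = sum(weights[c] * y for c, y in respondents)
    denominator = sum(weights[c] for c, _ in respondents)
    return numerator / denominator


# Made-up data: 450 sampled, 135 respondents (30%), response rate differs by class.
sampled_classes = ["young"] * 200 + ["old"] * 250
respondents = (
    [("young", 1)] * 30 + [("young", 0)] * 10   # 40 of 200 young answered
    + [("old", 1)] * 19 + [("old", 0)] * 76     # 95 of 250 old answered
)

naive = sum(y for _, y in respondents) / len(respondents)       # 49/135, about 0.363
adjusted = weighting_class_mean(sampled_classes, respondents)   # 200/450, about 0.444

print(f"naive: {naive:.3f}, adjusted: {adjusted:.3f}")
```

The adjustment removes the bias only if, within each class, respondents and non-respondents are alike with respect to y, which is exactly the "missing at random" assumption. The same weights could in principle be passed to a weighted logistic regression, although the standard errors then need to account for the weighting.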

If you are in the third case, you are unlucky. There is not much you can do.

If you want to read about the topic, have a look at a textbook such as Sharon Lohr's "Sampling: Design and Analysis".

HTH
