Solved – Statistical tests to do data quality checks


I want to identify bad responses in the data we get from our vendors. To assess the quality of the data, what statistical tests can I apply to identify bad respondents?

I performed a discriminant analysis to measure the percentage similarity of responses among surveys completed by a particular interviewer. How can I further validate my findings?

(added from comments)

What I mean by bad responses is: interviewers whose collected responses are very similar across respondents. They might tick "4" for a particular question for all the respondents (flat-liners or speeders).

Suppose an interviewer has conducted 10 surveys and I notice that the answers to each question are similar across those questionnaires. Is there a statistical test that supports my finding that the interviewer has probably filled in the responses himself, or has ticked the same response for each question? How can we check a particular data set using statistical tests? I hope I have clarified my question.

Best Answer

The keywords to search for are "interviewer falsification". The AAPOR/SRMS guidelines are a good starting place, and the RTI system is useful, too. As far as I know the evidence, interviewers who fabricate data may be able to get the first moments (the means) about right, but they have a harder time with higher-order moments, so falsified data may be detected via unusual variances and correlations/crosstabs.
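
To make the variance idea concrete, here is a minimal sketch of one possible check, not a method taken from the guidelines mentioned above. It assumes the answers are numeric codes (say 1 to 5) stored per interviewer as a list of interviews, and it uses a permutation test to ask how often a random set of interviews of the same size shows item variances as small as those of a given interviewer. The function names, the data layout, and the simulated data are all hypothetical, and the sketch covers only variances, not correlations or crosstabs.

```python
import random
from statistics import pvariance

def mean_item_variance(rows):
    """Average, over questions, of the variance of answers across interviews."""
    columns = list(zip(*rows))  # one tuple of answers per question
    return sum(pvariance(col) for col in columns) / len(columns)

def low_variance_p_value(data, interviewer, n_perm=2000, seed=0):
    """Permutation p-value for 'this interviewer's answers vary unusually little'.

    data: dict mapping interviewer id -> list of interviews,
          each interview being a list of numeric answers (assumed layout).
    """
    rng = random.Random(seed)
    own = data[interviewer]
    pool = [row for rows in data.values() for row in rows]
    observed = mean_item_variance(own)
    hits = 0
    for _ in range(n_perm):
        # Draw a random workload of the same size from all interviews.
        sample = rng.sample(pool, len(own))
        if mean_item_variance(sample) <= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Simulated data (purely illustrative): five interviewers with random
# answers and one who returns the same answers for every respondent.
gen = random.Random(42)
data = {
    f"honest_{i}": [[gen.randint(1, 5) for _ in range(8)] for _ in range(10)]
    for i in range(5)
}
data["suspect"] = [[4, 4, 4, 3, 4, 4, 4, 4] for _ in range(10)]

p_suspect = low_variance_p_value(data, "suspect")
p_honest = low_variance_p_value(data, "honest_0")
print(p_suspect, p_honest)
```

A small p-value only says that the interviewer's answers vary less than a random workload would; it is a flag for follow-up (re-contacting respondents, checking interview durations), not proof of falsification, since a homogeneous assignment area could produce the same pattern.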
