Solved – Pre- post- questionnaire (t-test)

sampling, survey, t-test

I came across this forum as I am trying to get a convincing answer. Apologies for speaking 'plain English' as I am not a statistical genius!

Basically I am implementing a pre- and post- questionnaire for young people (aged 12-17), aimed at demonstrating the impact of a one-day educational programme. The idea is to ask students to self-assess before and after the programme across a set of 5 indicators (teamwork, confidence, employability …). The rating scale will be 1-5 (strongly disagree to strongly agree) or 1-4 (removing the middle option).

My question is whether I need to pair each pre- and post- answer to the same student. I thought of two ways of collecting the information. ONE: giving bunches of pre- and post- forms to the class with no identifiers for each student, in which case I would aggregate all the pre- and post- answers for each indicator, then work out the difference in aggregate. TWO: pairing the pre- and post- answers for each student (probably using a barcode or unique serial number), which would let me work out individual score differences that can also be aggregated subsequently.

I wonder which method gives the stronger results statistically speaking, or whether there is no difference in the end, since arithmetically I will get to the same answer. I will use optical marking forms, and the reason for my question is to decide whether the time savings of the first method (where teachers can distribute forms regardless of who fills them in) will result in dubious results. I am weighing practicality on the ground against data solidity, as I would like to work out statistically significant percentage increases (using a t-statistic that the software can work out automatically).

Many thanks for your responses.
M

Best Answer

While it is true that, for the same set of students, the mean pre-post difference will be the same between methods one and two, for statistical inference, all else being equal, the paired method is preferable. A common way to express this is to say that subjects "serve as their own controls." Through this method, one removes the portion of the variance in scores that comes from stable differences between students, and removing this variability allows the factor of interest to show up more clearly. It might also seem intuitive that in the paired method, validity is higher because the two sets of scores are more plainly comparable than they would be if the pre- and post- groups were allowed (potentially) to comprise different subsets of students.
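To make the point concrete, here is a minimal sketch in Python using made-up ratings for eight students (the data and variable names are purely illustrative, not from any real study). It computes the same mean difference both ways, then the standard error each method would use.

```python
import math
import statistics

# Hypothetical 1-5 ratings for eight students (illustrative only).
# Position i in both lists is the same student.
pre = [2, 3, 3, 4, 2, 5, 3, 4]
post = [3, 4, 3, 5, 3, 5, 4, 5]
n = len(pre)

# Both methods give the same mean pre-post difference.
mean_diff = statistics.mean(post) - statistics.mean(pre)

# Method ONE (no identifiers): two-sample standard error,
# built from the variance of each set of scores.
se_unpaired = math.sqrt(
    (statistics.variance(pre) + statistics.variance(post)) / n
)

# Method TWO (paired): standard error of the per-student differences.
diffs = [b - a for a, b in zip(pre, post)]
se_paired = statistics.stdev(diffs) / math.sqrt(n)

t_unpaired = mean_diff / se_unpaired
t_paired = statistics.mean(diffs) / se_paired

print(f"mean difference: {mean_diff:.2f}")
print(f"unpaired: SE = {se_unpaired:.3f}, t = {t_unpaired:.2f}")
print(f"paired:   SE = {se_paired:.3f}, t = {t_paired:.2f}")
```

In this toy data the students who start high also finish high, so the per-student differences vary much less than the raw scores do, and the paired t-statistic comes out larger for the identical mean difference.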

With regard to significance testing, the paired method figures to have higher power to detect a significant pre-post difference. The extent to which this is true is a function of the strength of any positive correlation between the two sets of scores. Power will also depend, as in method one, on sample size; variability within each set of scores; reliability of each indicator; alpha; and whether the test is one- or two-tailed.
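The dependence of power on the pre-post correlation can be checked with a rough simulation. The sketch below is only illustrative and rests on several assumptions of mine: scores are treated as continuous and normally distributed rather than as 1-5 ratings, the true effect is 0.4 standard deviations, there are 30 students, and the critical t values are hard-coded for that sample size at a two-tailed alpha of 0.05. The function name `simulate_power` is hypothetical.

```python
import math
import random
import statistics

N = 30  # students per simulated class (assumption)
# Two-tailed 5% critical values of Student's t for N = 30:
# df = 29 for the paired test, df = 58 for the two-sample test.
CRIT_PAIRED = 2.045
CRIT_UNPAIRED = 2.002


def simulate_power(rho, effect=0.4, sims=1000, seed=1):
    """Estimate (paired power, unpaired power) for pre-post correlation rho."""
    rng = random.Random(seed)
    hits_paired = 0
    hits_unpaired = 0
    for _ in range(sims):
        pre = [rng.gauss(0, 1) for _ in range(N)]
        # Post scores: shifted by the effect and correlated rho with pre.
        post = [
            effect + rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
            for a in pre
        ]
        diffs = [b - a for a, b in zip(pre, post)]
        t_paired = statistics.mean(diffs) / (
            statistics.stdev(diffs) / math.sqrt(N)
        )
        se_unpaired = math.sqrt(
            (statistics.variance(pre) + statistics.variance(post)) / N
        )
        t_unpaired = (statistics.mean(post) - statistics.mean(pre)) / se_unpaired
        if abs(t_paired) > CRIT_PAIRED:
            hits_paired += 1
        if abs(t_unpaired) > CRIT_UNPAIRED:
            hits_unpaired += 1
    return hits_paired / sims, hits_unpaired / sims


for rho in (0.0, 0.3, 0.7):
    paired, unpaired = simulate_power(rho)
    print(f"rho = {rho}: paired power = {paired:.2f}, unpaired power = {unpaired:.2f}")
```

Under these assumptions the two tests should perform about the same when the correlation is zero, with the paired test pulling ahead as the correlation grows. Changing `N`, `effect`, or the critical values shows how the other factors listed above enter.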

By the way, if power is a main concern, then using classic scale development methods to combine the 5 indicators into a smaller number of more reliable ones should increase power as well. It's also possible that you would find this to compromise validity and the interpretability of findings, so it's by no means an automatic decision.
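One common first step in that kind of scale work is to check how consistently the indicators move together, for instance with Cronbach's alpha, before averaging them into a composite. The sketch below uses invented responses from six students on five indicators. The data and the helper name `cronbach_alpha` are my own illustration, and alpha is only one of several possible checks.

```python
import statistics


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])
    item_variances = [
        statistics.variance([row[i] for row in rows]) for i in range(k)
    ]
    total_variance = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 6 students x 5 indicators, each rated 1-5.
responses = [
    [4, 4, 5, 4, 4],
    [2, 3, 2, 2, 3],
    [3, 3, 4, 3, 3],
    [5, 4, 5, 5, 4],
    [1, 2, 2, 1, 2],
    [3, 4, 3, 3, 4],
]

alpha = cronbach_alpha(responses)
# A simple composite: each student's mean across the five indicators.
composite = [statistics.mean(row) for row in responses]

print(f"Cronbach's alpha: {alpha:.2f}")
print("composite scores:", composite)
```

If alpha is high, a single composite per student may be a more reliable outcome to test than five separate indicators. If it is low, the indicators are probably measuring different things and combining them would blur the findings, which is the validity concern mentioned above.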
