Solved – Paired t-test or Wilcoxon signed-rank test on small sample

hypothesis-testing, normality-assumption, t-test, wilcoxon-signed-rank

I have ten observations each from a measurement ($precision$) for two algorithms (let them be called $A$ and $B$) for the same data. I wish to test the alternative hypothesis that the mean precision of $A$ is statistically significantly greater than the mean precision of $B$.

I'm unsure of whether I should use a paired t-test or a Wilcoxon signed-rank test. When I run a Shapiro-Wilk normality test to check for normality, I get p-values > 0.05, indicating that I cannot reject the null hypothesis that they belong to a normally distributed population. However, I cannot conclude they belong to a normally distributed population either.

In light of this uncertainty, I feel that I should go with the Wilcoxon signed-rank test because it makes no assumptions about normality, rather than a t-test which assumes a normal distribution. Is this thinking correct?

Edit:
Here are the two qq-plots for $precision$ observations for $A$ and $B$.

[Figure: Q-Q plots of the $precision$ observations for $A$ and $B$]

Best Answer

First of all, there is no way to prove that any data set comes from an exactly normal distribution. The idea of goodness-of-fit tests is to check empirically whether the distribution is at least close to normal.

Given that, your question really gets to the heart of parametric versus nonparametric inference. You can always feel better about making fewer assumptions with nonparametric methods. Some people call this safe or conservative. So why ever use parametric inference? The answer is efficiency. Theoretically, when the parametric model holds, the best parametric estimate is more efficient. In your case, the paired t-test is more efficient than the signed-rank test under normality.
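To put a rough number on "more efficient": a standard textbook result (not something derived from your data) is the asymptotic relative efficiency (ARE) of the signed-rank test relative to the t-test for a symmetric density $f$ with variance $\sigma^2$. The symbols $f$ and $\sigma^2$ are introduced here only for illustration.

$$
\mathrm{ARE}(W, t) = 12\,\sigma^2 \left( \int_{-\infty}^{\infty} f(x)^2 \, dx \right)^2
$$

For a normal density, $\int f(x)^2\,dx = \frac{1}{2\sigma\sqrt{\pi}}$, so

$$
\mathrm{ARE}(W, t) = 12\,\sigma^2 \cdot \frac{1}{4\pi\sigma^2} = \frac{3}{\pi} \approx 0.955 .
$$

So even when normality holds exactly, the signed-rank test loses only about 5% efficiency in large samples. This is an asymptotic statement, so for $n = 10$ it should be read as a guide rather than an exact figure.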

In practice, when you have a sample size as low as 10, it is hard to reject normality using any goodness-of-fit test, because such tests have little power at that size. So unless you have a strong reason to believe in normality (from considerations outside your sample), use the signed-rank test. It should be reassuring that these nonparametric tests have reasonably high efficiency.
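As a rough illustration of what the signed-rank test does with ten pairs, here is a minimal sketch of the exact one-sided test in plain Python. The precision values are made up for the example (they are not the asker's data), and the helper names `signed_ranks` and `exact_p_greater` are hypothetical. Note that the test works on the paired differences $A - B$, which is also where the normality assumption of the paired t-test applies, rather than on each sample separately.

```python
from itertools import product

# Hypothetical precision values in percent (NOT the asker's data).
a = [82, 79, 88, 91, 76, 85, 80, 87, 90, 83]  # algorithm A
b = [80, 80, 83, 84, 73, 79, 84, 78, 80, 75]  # algorithm B


def signed_ranks(x, y):
    """Return (differences, ranks of |differences|), dropping zero differences.

    Tied absolute differences receive the average of their ranks.
    """
    diffs = [xi - yi for xi, yi in zip(x, y) if xi != yi]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # positions i..j are ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return diffs, ranks


def exact_p_greater(w_plus, ranks):
    """Exact P(W+ >= w_plus) under H0, enumerating all 2^n sign assignments."""
    n = len(ranks)
    hits = 0
    for signs in product((0, 1), repeat=n):
        if sum(r for r, s in zip(ranks, signs) if s) >= w_plus:
            hits += 1
    return hits / 2 ** n


diffs, ranks = signed_ranks(a, b)
# W+ is the sum of the ranks belonging to positive differences (A > B).
w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
p_value = exact_p_greater(w_plus, ranks)
print(f"W+ = {w_plus}, one-sided exact p = {p_value:.4f}")
```

With only ten pairs the full enumeration is just 1024 sign patterns, so the exact null distribution is cheap to compute. In practice you would more likely call a library: assuming a reasonably recent SciPy, `scipy.stats.wilcoxon(a, b, alternative="greater")` and `scipy.stats.ttest_rel(a, b, alternative="greater")` run the two tests discussed here.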