Solved – One group has only zero values: should I use a parametric or a non-parametric test?

nonparametric, zero-inflation

I have 3 groups (n = 5 per group) of biological data. I checked the normality of each group with the Shapiro-Wilk test, and two of the groups are normally distributed. However, the values in the 3rd group are all zero, so there is no variability in that group. The data are discrete and can never be negative, because I am measuring the number of worms in my experiments.
Should I use a parametric test (ANOVA) or a non-parametric test (Kruskal-Wallis) to compare the 3 groups?

Other research groups with similar data have used ANOVA, but I am not sure why they treat the group with all zeros as normally distributed. Here is an example: https://www.ncbi.nlm.nih.gov/pubmed/29540816 (in Figure 2a, "sub-group 3" has values all equal to zero and is compared with the "sham" group by one-way ANOVA).

Here is an example of what my data look like:
Group A (negative control): 110, 94, 85, 67, 89
Group B (experimental group, also tested by other researchers): 0, 0, 0, 0, 0
Group C (experimental group): 24, 56, 67, 34, 26

All my subjects were infected, and the groups represent different drug treatments. My measurement is the number of worms, so if a treatment works there will be fewer worms than in the negative control, or even no worms at all.

Disclaimer: I am not great with stats, but all help is appreciated!

Best Answer

First, let me try to summarize:

  • You have 3 groups, N = 5 in each
  • You have a count of worms for each subject
  • In one group, the counts are all 0 and will always be 0. This is not about the sample, but about the population.
  • Your main interest is whether one of the groups with non-zero data is different from the group with all 0 data

Given the last point, the answer seems to be an automatic "yes". The only question is whether that difference is "significant". But since there is no variation in the all-zero group, you are essentially testing whether the counts in the other group are significantly different from 0.

Given your actual data (110, 94, 85, 67, 89), I would be strongly tempted to avoid any significance testing at all and just say "Look! It's not 0!". If someone objected, you could reply: "Well, I can gather more observations, and because the other group is all 0 while some of these values will differ from 0, the difference will be significant once I gather enough data." As so often, significance doesn't seem to be the real issue here.

However, if you really wanted to do a significance test, I would probably first define exactly what you are looking for (mean? median? something else?) and then do a permutation test.
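As a minimal sketch of what such a permutation test could look like, assuming the quantity of interest is the difference in group means and using Groups B and C from the question: with 5 values per group there are only 252 ways to split the 10 pooled values, so every split can be enumerated exactly. The function name `exact_permutation_test` and the choice of a two-sided test are illustrative assumptions, not something specified in the question or answer.

```python
from itertools import combinations


def exact_permutation_test(x, y):
    """Two-sided exact permutation test on the difference in means.

    Illustrative helper (hypothetical name): enumerates every way of
    splitting the pooled values into groups of the original sizes.
    """
    pooled = list(x) + list(y)
    n_x = len(x)
    n_y = len(y)
    total = sum(pooled)
    observed = sum(x) / n_x - sum(y) / n_y

    extreme = 0   # splits at least as extreme as the observed one
    n_splits = 0  # total number of splits enumerated
    for idx in combinations(range(len(pooled)), n_x):
        s_x = sum(pooled[i] for i in idx)
        diff = s_x / n_x - (total - s_x) / n_y
        # Small tolerance guards against floating-point ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
        n_splits += 1
    return observed, extreme / n_splits


group_b = [0, 0, 0, 0, 0]       # all-zero treatment group from the question
group_c = [24, 56, 67, 34, 26]  # other experimental group from the question

observed, p_value = exact_permutation_test(group_c, group_b)
print(f"observed difference in means: {observed:.1f}")
print(f"two-sided exact p-value: {p_value:.4f}")
```

With these data, only the two splits that put all five non-zero counts in the same group are as extreme as the observed one, so the p-value is 2/252 (about 0.008). That is also the smallest two-sided p-value this test can produce with 5 observations per group, which is worth keeping in mind when reading the result.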
