Solved – Compare two samples with many zeros

mathematical-statistics, statistical significance, zero inflation

We ran a number of experiments and obtained 10 independent two-sample datasets.

Is it possible to show a significant difference between the two samples, if each of them contains more than 75% zeros (and we don't want to exclude zeros from these samples)?

Example box plots of the two samples from one of our experiments:

[Figure: box plots of the two samples]

Note that across the 10 independent models (experiments) the difference looks approximately the same visually, but the Kolmogorov-Smirnov, Brunner-Munzel and Wilcoxon tests give unstable p-values from model to model.

Which statistical test should we use to show that the differences are significant in these cases? Or is it necessary to filter out the zero values?

Best Answer

With a lot of zeros in both series, it may be difficult to reject the null hypothesis that the means are the same. But you could test for differences in the deciles (or other quantiles) of the two distributions. If the tails are different then the samples are different. For a test see Li, Tiwari and Wells (1996).
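As a rough illustration of comparing quantiles, here is a minimal sketch in Python. It is not the Li, Tiwari and Wells procedure; it swaps in a generic two-sided permutation test on the difference in one upper quantile. The function name `quantile_permutation_test`, its parameters, and the synthetic data are all assumptions made for this example. Note that when more than 75% of each sample is zero, every quantile up to the 0.75 level is zero in both samples, so only quantiles above the zero fraction (for example `q=0.9`) can differ at all.

```python
import numpy as np


def quantile_permutation_test(x, y, q=0.9, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in the q-th quantile.

    Hypothetical helper for illustration; not the test from
    Li, Tiwari and Wells (1996).
    Returns (observed_difference, p_value).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Observed statistic: difference of the sample quantiles (zeros kept).
    observed = np.quantile(x, q) - np.quantile(y, q)

    pooled = np.concatenate([x, y])
    n_x = x.size
    count = 0
    for _ in range(n_perm):
        # Under the null the group labels are exchangeable, so reshuffle them.
        perm = rng.permutation(pooled)
        diff = np.quantile(perm[:n_x], q) - np.quantile(perm[n_x:], q)
        if abs(diff) >= abs(observed):
            count += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value


# Synthetic, made-up data: about 80% zeros in each sample,
# with different scales for the non-zero part.
data_rng = np.random.default_rng(42)
sample_a = np.where(data_rng.random(200) < 0.8, 0.0, data_rng.exponential(2.0, 200))
sample_b = np.where(data_rng.random(200) < 0.8, 0.0, data_rng.exponential(6.0, 200))

obs, p = quantile_permutation_test(sample_a, sample_b, q=0.9, n_perm=2000)
print(f"difference in 0.9 quantile: {obs:.3f}, permutation p-value: {p:.4f}")
```

Because so few observations are non-zero, a test on a single quantile can have low power, so it may be worth checking several upper quantiles (with a multiple-comparison correction) rather than relying on one.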