Solved – How to compare means of two independent groups with non-normal distributions, unequal variances and unequal sample sizes

hypothesis testing, nonparametric, sample-size, variance, wilcoxon-mann-whitney-test

As you can see from the title, I have these two groups.

Group A: n=1520
Group B: n=115

I want to compare their scores on a 15-item instrument, but the scores are not normally distributed, the variances are unequal and, as you can see, the sample sizes differ greatly between the two groups. I have tried the Mann-Whitney test before, but as far as I know it assumes equal variances. If anyone knows which analysis is the correct one to run in this case, please let me know.

Best Answer

Mann-Whitney, and many other nonparametric tests, assume stochastic ordering of the samples (or more, for other tests). Essentially, this means that the two distributions do not cross when they are different; one cumulative distribution is above the other or, equivalently, to the left of the other.

If two samples are normally distributed, then the only way they can be stochastically ordered is if their variances are equal. The same is true for other symmetric, two-parameter distributions such as the logistic. So for these kinds of distributions, equal variances are required for a properly interpretable rank-based test, such as the Mann-Whitney.
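To see why unequal variances break the ordering for normal distributions, here is a minimal sketch using only the Python standard library. The means and standard deviations are made-up values for illustration, not estimates from the question's data. It checks that the two cumulative distributions meet at one point and that their difference changes sign there.

```python
from statistics import NormalDist

# Hypothetical parameters, chosen only to illustrate the point.
a = NormalDist(mu=0.0, sigma=1.0)
b = NormalDist(mu=0.5, sigma=2.0)


def cdf_difference(t):
    """Return F_a(t) - F_b(t); a sign change means the two CDFs cross."""
    return a.cdf(t) - b.cdf(t)


# Two normal CDFs are equal where the z-scores are equal:
# (t - mu_a) / sigma_a == (t - mu_b) / sigma_b
crossing = (a.mean * b.stdev - b.mean * a.stdev) / (b.stdev - a.stdev)

print("crossing point:", crossing)
print("difference below the crossing:", cdf_difference(crossing - 1))
print("difference above the crossing:", cdf_difference(crossing + 1))
```

With equal standard deviations the denominator is zero and there is no crossing point, which is the stochastically ordered case described above.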

Several distributions, such as the exponential, have different variances because their means are different. In many cases, as the mean increases the variance must also increase. Many of these are still appropriate for rank tests such as the Mann-Whitney.
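As a rough check of the exponential case, the sketch below compares two exponential cumulative distributions on a grid. The two means are hypothetical. The variance of an exponential is the square of its mean, so the variances differ, yet one CDF stays above the other everywhere on the grid.

```python
import math


def exponential_cdf(t, mean):
    """CDF of an exponential distribution parameterised by its mean."""
    return 1.0 - math.exp(-t / mean) if t > 0 else 0.0


# Hypothetical means, for illustration only.
mean_low, mean_high = 1.0, 3.0

# For an exponential distribution the variance equals the mean squared.
variance_low, variance_high = mean_low ** 2, mean_high ** 2

# Evaluate the gap between the two CDFs on a grid of positive values.
grid = [i / 10 for i in range(1, 301)]
gaps = [exponential_cdf(t, mean_low) - exponential_cdf(t, mean_high) for t in grid]

# True when the CDF with the smaller mean lies above the other at every grid point.
never_cross = all(gap > 0 for gap in gaps)
print("variances:", variance_low, variance_high)
print("CDFs never cross on the grid:", never_cross)
```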

First, don't worry about it so much. Go ahead and do the rank test for small samples. For large samples such as yours I doubt that a test of equal means is very informative. Nonparametric tests really shine where the samples are small and it is difficult to use methods that depend on estimating parameters, fitting distributions, etc.

Second, examine some graphs that show the cumulative distributions of the samples you are comparing. If they cross over once, then they are likely not stochastically ordered. If they don't cross, except maybe at the low end, then they are likely stochastically ordered; do the rank test you like. If they cross back and forth, then they are more likely stochastically the same; do the rank test you like, but it is unlikely to be significant.
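If you want a number to go with the graphs, one option is to count how often the two empirical cumulative distributions change order. This is only a sketch: the function name `count_ecdf_crossings` and the small data sets are hypothetical, and with real samples the noise in the tails can produce extra crossings, so it complements the plot rather than replacing it.

```python
from bisect import bisect_right


def count_ecdf_crossings(sample_a, sample_b):
    """Count sign changes of F_a(t) - F_b(t) over the pooled observed values."""
    a, b = sorted(sample_a), sorted(sample_b)
    signs = []
    for t in sorted(set(a) | set(b)):
        # Compare the two empirical CDFs using cross-multiplied counts,
        # which keeps the arithmetic in integers.
        diff = bisect_right(a, t) * len(b) - bisect_right(b, t) * len(a)
        if diff != 0:  # points where the ECDFs touch are skipped
            signs.append(diff > 0)
    return sum(prev != cur for prev, cur in zip(signs, signs[1:]))


# Hypothetical data: a pure shift, and a narrow sample against a wide one.
shifted = count_ecdf_crossings([1, 2, 3, 4], [11, 12, 13, 14])
spread = count_ecdf_crossings([4, 5, 6], [0, 5, 10])
print("crossings for shifted samples:", shifted)
print("crossings for samples with different spread:", spread)
```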

Try fitting some distributions to the large sample, then estimate the corresponding parameters for the small sample. See what they look like and describe them. (This is useful.)
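One simple way to do this, sketched below, is a method-of-moments fit. The choice of a gamma distribution is my assumption for illustration (it suits positive data whose variance grows with the mean), and the two short lists are made-up stand-ins for the large and small groups.

```python
from statistics import fmean, variance


def gamma_method_of_moments(sample):
    """Return (shape, scale) of a gamma distribution matched to the sample moments."""
    m = fmean(sample)
    v = variance(sample)  # sample variance, n - 1 in the denominator
    return m * m / v, v / m


# Hypothetical stand-ins for the large and the small group.
large_group = [2, 3, 3, 4, 5, 5, 6, 8]
small_group = [1, 2, 2, 3, 4]

for name, data in (("large", large_group), ("small", small_group)):
    shape, scale = gamma_method_of_moments(data)
    print(name, "shape:", round(shape, 3), "scale:", round(scale, 3))
```

Comparing the fitted shape and scale side by side gives a description of how the groups differ that a single p-value does not.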

If you have observations that are bounded at zero, e.g., times or counts, then you are likely to have a distribution where the variance and the mean are functionally related.

The following material added in response to a Comment

Let me first note that I am not responsible for Wikipedia. I often find it useful, though information from it should be taken cautiously.

There are two articles there, as shown below. I don't know which one of those, or another one, you referred to. The first article uses the term "stochastic ordering"; the second uses the term "stochastic dominance." They mean the same thing.

Under the null hypothesis the Mann-Whitney assumes that the two distributions are the same. (No parameters need to be mentioned here.) The alternative that is tested for is that one distribution is stochastically greater than the other. You might not regard this as an assumption of the test. But if the two distributions are not stochastically the same or stochastically ordered then the alpha level or p-value of the test is not correct. (I regard this as an assumption of the test, though others might differ on what that expression means.)

As to the normal or gaussian distribution, if the variances differ then the assumptions of the Mann-Whitney test do not hold. Other distributions with different variances are suitable for the Mann-Whitney if they are stochastically ordered.

This term was used by Mann and Whitney in their article in 1947, according to Wikipedia.

Mann–Whitney U test https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test

A thorough analysis of the statistic, which included a recurrence allowing the computation of tail probabilities for arbitrary sample sizes and tables for sample sizes of eight or less, appeared in the article by Henry Mann and his student Donald Ransom Whitney in 1947.[1] This article discussed alternative hypotheses, including a stochastic ordering (where the cumulative distribution functions satisfied the pointwise inequality F_X(t) < F_Y(t)). This paper also computed the first four moments and established the limiting normality of the statistic under the null hypothesis, so establishing that it is asymptotically distribution-free.

Talk: Mann-Whitney_U_test https://en.wikipedia.org/wiki/Talk:Mann%E2%80%93Whitney_U_test#What_do_you_need_to_assume_under_the_null_hypothesis.3F

Really, this test is precisely only for testing stochastic dominance of two variables A and B, that is, of Prob(A>B) > Prob(B>A). In other words, it tests whether a randomly chosen sample from A is expected to be greater than a sample from B.
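The two probabilities in that quote can be estimated directly from the samples by comparing every observation in A with every observation in B. The sketch below does this with sorting rather than a double loop. The function name `win_probabilities` and the data are hypothetical. The count of pairs where A wins, plus half the ties, is the Mann-Whitney U statistic for A, so this is the same information the test uses, shown on a probability scale.

```python
from bisect import bisect_left, bisect_right


def win_probabilities(sample_a, sample_b):
    """Estimate Prob(A > B) and Prob(B > A) over all pairs; ties count for neither."""
    b = sorted(sample_b)
    n_pairs = len(sample_a) * len(b)
    # For each x in A: how many values in B are strictly below / strictly above x.
    a_wins = sum(bisect_left(b, x) for x in sample_a)
    b_wins = sum(len(b) - bisect_right(b, x) for x in sample_a)
    return a_wins / n_pairs, b_wins / n_pairs


# Hypothetical scores with one tied pair (the value 3 appears in both groups).
p_a_greater, p_b_greater = win_probabilities([3, 4, 5], [1, 2, 3])
print("Prob(A > B):", p_a_greater)
print("Prob(B > A):", p_b_greater)
print("Prob(tie):", 1 - p_a_greater - p_b_greater)
```

Scores from a 15-item instrument can only take a limited number of values, so ties will be common and are worth reporting alongside the two probabilities.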