Solved – How to compare if two multinomial distributions are significantly different

anova, chi-squared-test, hypothesis testing, kullback-leibler, statistical significance

We can use a t-test to check whether two proportions are significantly different.
Similarly, is there a way to test whether two multinomial distributions (that is, two samples with more than two distinct values) are significantly different from each other?

For example, sample 1 has 100 red, 300 green, 400 yellow and 200 orange balls, and sample 2 has 101 red, 302 green, 399 yellow and 202 orange balls.

1. Is there a way to check whether the two samples above are significantly different? If so, can you explain how?
2. Is this the same as checking whether two multinomial distributions are significantly different?
3. I was told in an interview (if I remember correctly) that KL divergence can be used for this. Can I use KL divergence here, or to check whether a sample multinomial distribution is significantly different from an expected one?
4. If so, how do I check for significance with KL divergence, or what cutoff value of the KL divergence makes the difference significant (like the p-values in statistical tests)?
5. Can I use ANOVA or the chi-squared test for this? If so, can you please explain?

Best Answer

You can use a chi-squared test. Given two vectors of counts, the chi-squared test of homogeneity checks whether they differ significantly; given a single vector of counts, the chi-squared goodness-of-fit test checks whether its frequencies differ significantly from a given vector of probabilities.

Data comparison:

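x1 = c(100, 300, 400, 200)
x2 = c(101, 302, 399, 202)
# Stack the counts into a 2 x 4 table; chisq.test(x=x1, y=x2) would instead
# treat the two vectors as paired factors and cross-tabulate their values.
chisq.test(rbind(x1, x2))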
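As a rough sketch of what the two-sample comparison computes, the Pearson statistic can be built by hand from the 2 x 4 table of counts and checked against `chisq.test`. The names `counts`, `expected`, `stat` and `builtin` are illustrative and not part of the original answer.

```r
x1 <- c(100, 300, 400, 200)
x2 <- c(101, 302, 399, 202)

# One row per sample, one column per color
counts <- rbind(x1, x2)

# Expected counts under H0 (same color distribution in both samples):
# row total * column total / grand total
expected <- outer(rowSums(counts), colSums(counts)) / sum(counts)

# Pearson chi-squared statistic and its degrees of freedom
stat <- sum((counts - expected)^2 / expected)
df <- (nrow(counts) - 1) * (ncol(counts) - 1)

# Upper-tail probability of the chi-squared distribution
p_value <- pchisq(stat, df = df, lower.tail = FALSE)

# The built-in test on the same table should agree
builtin <- chisq.test(counts)
```

For these two samples the statistic is tiny compared with its 3 degrees of freedom, so the test gives no evidence of a difference.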

Frequency comparison:

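# Proportions of sample 2 used as the reference probabilities.
# Note: this treats p as known and ignores the sampling error in sample 2.
x1 = c(100, 300, 400, 200)
x2 = c(101, 302, 399, 202)
p = x2/(sum(x2))
chisq.test(x=x1,p=p)

OR, with the roles of the two samples swapped:

x1 = c(100, 300, 400, 200)
x2 = c(101, 302, 399, 202)
p = x1/(sum(x1))
chisq.test(x=x2,p=p)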
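On the KL divergence part of the question: a raw KL value has no universal cutoff, but the likelihood-ratio statistic (G-test) is a sample-size-weighted sum of KL divergences and is approximately chi-squared distributed under the null hypothesis. The sketch below assumes all counts are positive; the helper `kl` and the other variable names are illustrative, not from the original answer.

```r
x1 <- c(100, 300, 400, 200)
x2 <- c(101, 302, 399, 202)

# KL divergence between two probability vectors (assumes all entries > 0)
kl <- function(p, q) sum(p * log(p / q))

n1 <- sum(x1)
n2 <- sum(x2)
p1 <- x1 / n1                      # observed proportions, sample 1
p2 <- x2 / n2                      # observed proportions, sample 2
p_pool <- (x1 + x2) / (n1 + n2)    # pooled proportions under H0

# G statistic for the 2 x 4 table: 2 * sum(observed * log(observed / expected)),
# which equals 2 * (n1 * KL(p1 || p_pool) + n2 * KL(p2 || p_pool))
G <- 2 * (n1 * kl(p1, p_pool) + n2 * kl(p2, p_pool))

# Same degrees of freedom as the Pearson test on this table
df <- length(x1) - 1
p_value <- pchisq(G, df = df, lower.tail = FALSE)
```

So the significance of a KL divergence is judged through a p-value from this scaled statistic rather than through a fixed threshold on the divergence itself.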

Related Solutions

Solved – Test whether two multinomial samples come from the same distribution

You correctly performed a $\chi^2$-test of independence, so the only problem is in the formulation of its hypotheses and the interpretation of the test result:

The $\chi^2$-test of independence tests the null hypothesis "The two color distributions are equal" versus the alternative hypothesis of any difference. The p value is smaller than the prespecified level $\alpha$, so you reject the null hypothesis at significance level $\alpha$ and conclude that the colors are distributed differently between the urns.

The term "independence" test can be confusing, but it becomes clearer if you consider the "raw" data behind the contingency table:

Color   Urn
Blue      1
Blue      2
Green     2
Red       1
Blue      1
...

The null hypothesis that the variable "Urn" is independent of the random variable "Color" is equivalent to the null hypothesis stated above. So it's not about independence of the two color distributions but about independence of color and urn.
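To make the link between the raw rows and the contingency table concrete, here is a small sketch with made-up counts (the data frame `raw` and its numbers are hypothetical, not the data from the question). Tabulating Urn against Color and testing the table gives the same statistic as passing the two raw columns to `chisq.test`, which cross-tabulates them itself.

```r
# Hypothetical raw data: one row per drawn ball (counts are made up)
raw <- data.frame(
  Color = c(rep(c("Blue", "Green", "Red"), times = c(30, 20, 50)),   # urn 1
            rep(c("Blue", "Green", "Red"), times = c(45, 25, 30))),  # urn 2
  Urn = rep(c(1, 2), times = c(100, 100))
)

# Contingency table: rows are urns, columns are colors
tab <- table(raw$Urn, raw$Color)

# Test of independence on the table
from_table <- chisq.test(tab)

# Same test from the raw columns: they are coerced to factors
# and cross-tabulated internally
from_raw <- chisq.test(x = raw$Color, y = raw$Urn)
```

This is the situation in which the two-vector form of `chisq.test` is appropriate: the vectors hold one observation per row, not precomputed counts.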

Note that a large p value wouldn't mean that the color distributions were equal. This would be much harder to show by "classic" statistical methods.

Solved – How to compare two non-normally distributed samples with very different sizes? (Mann-Whitney vs Randomization/Bootstrap)

I am not a big expert on statistical testing, but the approach you are considering decidedly does not make sense. Imagine that the groups are indeed identical (i.e. null hypothesis is true). Then you will observe p<0.05 in exactly 5% of the cases, and e.g. p<0.01 in 1% of the cases (those would be false positives). So following your logic, you would reject the null.

I am not aware of any problems with Wilcoxon-Mann-Whitney test in case of different numbers of observations. So one option you have is to run the ranksum test as usual, without any further complications.

However, if you do feel concerned about the very different $N$, you can try a simple permutation test: pool both groups together (obtaining $81+5110=5191$ numbers), then randomly select $81$ values as group A and use all the rest as group B. Take the difference between the means (or medians) of A and B (let's call it $\mu$), and repeat this many times. This gives you the distribution $p(\mu)$ of the difference under the null hypothesis. For your actual groups X and Y you have a fixed empirical value $\mu^*$. Now check whether $\mu^*$ lies inside the central 95% interval of $p(\mu)$, that is, between its 2.5th and 97.5th percentiles. If it does not, you can reject the null with p<0.05.
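The permutation procedure described above could be sketched as follows. The data here are simulated placeholders (the original measurements are not available), and the names `x`, `y`, `n_perm` and `perm_diffs` as well as the number of permutations are assumptions for illustration.

```r
set.seed(1)

# Placeholder data with the group sizes from the question (values are simulated)
x <- rexp(81, rate = 1)
y <- rexp(5110, rate = 1)

# Observed difference in means between the actual groups
observed <- mean(x) - mean(y)

pooled <- c(x, y)
n_x <- length(x)
n_perm <- 2000   # number of random relabelings (arbitrary choice)

# For each relabeling, draw 81 pooled values as group A, the rest as group B
perm_diffs <- replicate(n_perm, {
  idx <- sample(length(pooled), n_x)
  mean(pooled[idx]) - mean(pooled[-idx])
})

# Central 95% interval of the permutation distribution
interval <- quantile(perm_diffs, c(0.025, 0.975))

# Two-sided permutation p-value (the +1 terms count the observed labeling)
p_value <- (sum(abs(perm_diffs) >= abs(observed)) + 1) / (n_perm + 1)
```

Replacing `mean` with `median` in both places gives the median-based variant mentioned in the text.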
