Variance Analysis for Overlapping Samples in Surveys Using ANOVA and Kruskal-Wallis Test

Tags: anova, hypothesis testing, kruskal-wallis test, statistical significance, survey

How do I check if multiple groups are different regarding a certain property, if these groups overlap?

Example

I conduct a survey and ask participants two questions:

1. Tick all ice-cream flavors you like (multiple-choice)

    [ ] Vanilla   [ ] Chocolate   [ ] Strawberry   [ ] Banana   [ ] Mango


2. How often do you experience headaches? (single-choice)

    ( ) Very Often    ( ) Sometimes    ( ) Rarely    ( ) Never

So I get data that looks something like this:

    participantId  Vanilla  Chocolate  Strawberry  Banana  Mango  Headaches
    1              Yes      No         Yes         No      No     Very Often
    2              Yes      No         No          No      Yes    Sometimes
    3              No       Yes        No          No      No     Rarely

Now I would like to check whether participants who prefer certain flavors experience headaches more often.

Which statistical test would be suitable to answer this question?

If I see correctly, I can't use the standard options, since

  • ANOVA and Kruskal-Wallis assume that the observations in different groups are independent, so the groups have to be disjoint. This is not the case for me: if I compared the Vanilla group and the Strawberry group, they would overlap, because participant 1 belongs to both.
  • The Friedman test assumes repeated measurements, so in my case it would require that each participant tried each flavor and then rated how much headache they experienced.

What I tried so far

  • consider only participants who like exactly one flavor

    -> bad, because most participants ticked at least two options

  • consider each combination of flavors as a separate group

    -> bad, because there are many combinations, so the groups would get very small

  • create distinct groups via clustering participants who like similar flavors (e.g., via K-Means)

    -> could work, but is this the right way to approach this problem?

Is there a standard solution for this problem?

Best Answer

In my opinion, it would be best to first pose a research question using your domain knowledge and then test that specific question. In your example, you might wonder whether ice cream with fruit leads to more headaches, which reduces the number of groups to two. This makes your findings interpretable, and they are automatically backed by a non-statistical argument. In fact, statistical tests should not be used for exploratory analysis because of the multiple testing problem (if you run enough tests, you will find something that appears significant but actually is not).
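
To make the multiple testing problem concrete, here is a minimal sketch of how quickly the chance of at least one false positive grows. It assumes that every null hypothesis is true and that the tests are independent, which will not hold exactly for overlapping flavor groups, so treat the number as a rough illustration only. The function names are made up for this example.

```python
from itertools import combinations

FLAVORS = ["Vanilla", "Chocolate", "Strawberry", "Banana", "Mango"]


def count_flavor_combinations(flavors):
    """Count every non-empty subset of flavors (each one is a candidate group)."""
    return sum(
        1
        for k in range(1, len(flavors) + 1)
        for _ in combinations(flavors, k)
    )


def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive among `num_tests` independent
    tests at level `alpha`, assuming all null hypotheses are true."""
    return 1.0 - (1.0 - alpha) ** num_tests


num_tests = count_flavor_combinations(FLAVORS)  # 2**5 - 1 combinations
print(num_tests, familywise_error_rate(num_tests))
```

With five flavors there are already 31 possible combinations, so under these assumptions a spurious "significant" result is more likely than not.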

However, if your goal is to check whether there is any combination of preferences that leads to a higher rate of headaches, then there is no way around testing every combination. Note that in this case you should definitely account for the multiple testing by using a correction method such as Bonferroni (https://en.wikipedia.org/wiki/Bonferroni_correction). [Other correction methods might be more powerful, but I am not too deep into this topic.]
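
As a rough sketch of what the Bonferroni correction does: each raw p-value is multiplied by the number of tests (capped at 1) and then compared with the usual significance level. The p-values below are made up for illustration and do not come from any real survey.

```python
def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and the reject decisions.

    Each p-value is multiplied by the number of tests and capped at 1.
    """
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    reject = [p_adj <= alpha for p_adj in adjusted]
    return adjusted, reject


# Hypothetical raw p-values, one per tested flavor combination.
raw_p = [0.004, 0.030, 0.200, 0.012, 0.650]
adjusted, reject = bonferroni(raw_p)
for p, p_adj, r in zip(raw_p, adjusted, reject):
    print(f"raw={p:.3f}  adjusted={p_adj:.3f}  reject={r}")
```

Note that 0.030 and 0.012 would look significant on their own but no longer pass once the five tests are taken into account.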

I do not think that clustering is promising, for the following reasons:

  • It is not clear that there are clusters in your data.
  • The test results might be very hard to interpret (the subgroups might not be homogeneous, or you might not recognize them as homogeneous).

By the way: with a smaller sample size you should consider using Fisher's exact test (https://en.wikipedia.org/wiki/Fisher%27s_exact_test), which is nonparametric and exact for any sample size. With a larger sample size, you can use a chi-square test.
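
Here is a minimal pure-Python sketch of the two-sided Fisher exact test for a 2x2 table. It assumes one flavor at a time is compared ("ticked" versus "not ticked", which are disjoint groups) and that the headache answers are collapsed into two categories. That collapsing is my simplification for the example, not something the test requires of your data in general. The counts in the usage line are invented.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1, n = a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo = max(0, col1 - row2)   # smallest feasible top-left count
    hi = min(row1, col1)       # largest feasible top-left count
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties).
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts: rows = likes Mango (yes / no),
# columns = headaches (Very Often or Sometimes / Rarely or Never).
print(fisher_exact_2x2(8, 2, 1, 5))
```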

What you could also do is use a regression method such as logit or probit (https://en.wikipedia.org/wiki/Logistic_regression). You can then use the preferences as covariates (coded with 0s and 1s), and you can add any combinations of preferences as additional terms. In the end, you will get nicely interpretable results (including p-values) similar to a linear regression, and you can use a model to which you add only those combinations that can reasonably be expected to have an influence. The downside is that you implicitly assume a certain relationship between the covariates and headache. Also, if you include many different combinations, you will have a similar issue as above (some combination might appear significant just by accident).
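
To illustrate the coding step only, here is a small sketch that turns the survey rows from the question into 0/1 covariates plus one combination (interaction) term. The choice of Vanilla and Strawberry as the combination and the helper name `encode` are arbitrary assumptions for the example. Fitting the model itself is not shown and would need a statistics library. The response is kept as an ordered integer code here, so a plain logit or probit model would first require collapsing it to two levels.

```python
FLAVORS = ["Vanilla", "Chocolate", "Strawberry", "Banana", "Mango"]
HEADACHE_LEVELS = ["Never", "Rarely", "Sometimes", "Very Often"]  # low to high

# The three example rows from the question.
survey = [
    {"Vanilla": "Yes", "Chocolate": "No", "Strawberry": "Yes",
     "Banana": "No", "Mango": "No", "Headaches": "Very Often"},
    {"Vanilla": "Yes", "Chocolate": "No", "Strawberry": "No",
     "Banana": "No", "Mango": "Yes", "Headaches": "Sometimes"},
    {"Vanilla": "No", "Chocolate": "Yes", "Strawberry": "No",
     "Banana": "No", "Mango": "No", "Headaches": "Rarely"},
]


def encode(row, interactions=(("Vanilla", "Strawberry"),)):
    """Return (covariates, response) for one participant.

    Covariates are 0/1 indicators per flavor plus one product term per
    requested flavor pair; the response is the rank of the headache level.
    """
    x = {flavor: int(row[flavor] == "Yes") for flavor in FLAVORS}
    for f1, f2 in interactions:
        x[f"{f1}*{f2}"] = x[f1] * x[f2]  # 1 only if both flavors are ticked
    y = HEADACHE_LEVELS.index(row["Headaches"])
    return x, y


for participant in survey:
    print(encode(participant))
```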


I just saw that you have an ordinal response variable, so the tests you have proposed are fine. Sorry! If you would like to follow the regression idea, then you should use an ordinal regression model instead of probit or logit.