Solved – Non-normal distribution even with Kruskal-Wallis test

kruskal-wallis test, nonparametric

Thanks to anyone who can help!
I've run a survey with Likert-scale questions and about 1,500 participants. Initial analysis in SPSS showed no outliers (using boxplots), but scores were not normally distributed for any of the factors, as assessed by the Shapiro-Wilk test (p < .001). Since the independent variable had 5 groups, I ran a Kruskal-Wallis H test and found significant results for the tested variables (p < .001 for most). However, an inspection of the boxplots that accompany these new results showed that the shapes of the distributions are not similar (i.e. the medians were sometimes higher or lower than for the others being compared, etc.). My questions:

Is this a serious flaw? Can I use these results? If so, should I mention that in at least one independent variable, the groups displayed non-normal distributions?

thanks!

Best Answer

This is not a serious or fatal flaw. Here's why.

In its most general form the Kruskal-Wallis test is a test for stochastic dominance among $k$ groups. That is, the null hypothesis is:

$H_{0}\text{: } P\left(X_{i} > X_{j}\right) = 0.5$

for all groups $i$ and $j$ from $1$ to $k$, with the alternative hypothesis:

$H_{\text{A}}\text{: } P\left(X_{i} > X_{j}\right) \ne 0.5$

for at least one pair of groups $i \ne j$. Putting the null hypothesis into plain language, no stochastic dominance would mean that a randomly drawn observation from group $i$ is just as likely to be larger as to be smaller than a randomly drawn observation from group $j$, for any two groups. Putting the alternative hypothesis into plain language, the existence of stochastic dominance would mean that a randomly drawn observation from at least one group $i$ is more likely to be larger than to be smaller than a randomly drawn observation from a different group $j$.
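To make the quantity in these hypotheses concrete, here is a minimal sketch that estimates $P(X_i > X_j)$ from two samples by comparing every pair of observations. Because Likert responses contain many ties, the sketch counts each tie as half a win, which is a common convention but an assumption on my part, not something SPSS reports directly. The function name `prob_superiority` and the data are hypothetical, made up for illustration.

```python
from itertools import product

def prob_superiority(x, y):
    """Estimate P(X > Y) + 0.5 * P(X = Y) using all pairs of observations."""
    wins = 0.0
    for a, b in product(x, y):
        if a > b:
            wins += 1.0
        elif a == b:
            wins += 0.5  # assumed convention: a tie counts as half a win
    return wins / (len(x) * len(y))

# Hypothetical 5-point Likert responses (not from the survey in the question).
same_a = [1, 2, 3, 4, 5]
same_b = [1, 2, 3, 4, 5]
shifted_a = [2, 3, 4, 5, 5]
shifted_b = [1, 1, 2, 3, 4]

p_same = prob_superiority(same_a, same_b)           # identical samples
p_shifted = prob_superiority(shifted_a, shifted_b)  # group a tends to be higher
print(p_same, p_shifted)
```

A value near 0.5 is what the null hypothesis describes for that pair of groups. A value well away from 0.5 is the kind of pattern the Kruskal-Wallis test is sensitive to.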

These hypotheses make no assumptions about the shape, width or location of the distributions of the groups being compared, other than that within each group the observations are independent and identically distributed (i.i.d.).

Sometimes researchers want to use these tests (and other nonparametric tests) as tests for median difference (instead of tests for stochastic dominance). However, in order to interpret test results this way, one has to make the additional assumption that the distributions of all groups have the same shape and width and differ only in location. If you must provide inference about median difference and one of your groups has a differently shaped distribution than another, then yes, the test is fatally flawed for that purpose. But stochastic dominance is often a good enough kind of inference to make.
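A small sketch of why the median interpretation needs that extra assumption: the two hypothetical groups below have the same median but differently shaped distributions, and one still stochastically dominates the other. The data and the helper name `prob_superiority` are invented for illustration, and ties are counted as half a win (an assumed convention for discrete Likert data).

```python
from statistics import median

def prob_superiority(x, y):
    """Estimate P(X > Y) + 0.5 * P(X = Y) using all pairs of observations."""
    total = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    return total / (len(x) * len(y))

# Hypothetical Likert responses: same median, opposite skew.
group_a = [3, 3, 3, 4, 5]  # tail stretches upward
group_b = [1, 2, 3, 3, 3]  # tail stretches downward

median_a = median(group_a)
median_b = median(group_b)
p_a_over_b = prob_superiority(group_a, group_b)

print(median_a, median_b)  # the medians are equal
print(p_a_over_b)          # yet this is far from 0.5
```

Here a significant result would say something real about the groups (responses in `group_a` tend to be higher), but it would be wrong to report it as a difference in medians.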