Solved – Kruskal-Wallis test: assumption testing and interpretation of the results

Tags: assumptions, heteroscedasticity, interpretation, kruskal-wallis test, nonparametric

There is a chapter on the Kruskal-Wallis (KW) test on the website Influential Points, and there are some quotes I'm not sure I understand correctly:

Quote 1:

Some authors state unambiguously that there are no distributional
assumptions, others that the homogeneity of variances assumption
applies […]
If you wish to compare medians or means, then the Kruskal-Wallis test
also assumes that observations in each group are identically and
independently distributed apart from location. If you can accept
inference in terms of dominance of one distribution over another, then
there are indeed no distributional assumptions.
[link to chapter]

Quote 2:

…heterogeneous variances will make interpretation of the result more
complex…
[link to chapter]

My questions:

1. Suppose I analyze the dataset chickwts, which is included in base R (a boxplot of the data is below), and say it meets all required assumptions. In practical terms, from a biologist's point of view, how does the interpretation of the Kruskal-Wallis test results change if I carry out the KW test as a test for medians versus as a test for stochastic dominance? What can I conclude from the data in each case?
2. From quote 2 I infer that I should carry out Levene's/Brown-Forsythe test to check for heteroscedasticity. Am I right? If so, how does the result of Levene's test influence the interpretation of the Kruskal-Wallis test?
3. Should I carry out other statistical tests (e.g., the Kolmogorov-Smirnov test) or make special plots (e.g., a QQ plot for each pair of groups) to check whether the distributions of the data in each group have approximately the same shape?

The dataset:

data(chickwts)
boxplot(weight~feed, data = chickwts, las = 3)

Best Answer

The KW test (also the Mann-Whitney U-test) is essentially always a test for stochastic dominance. What that means is it is testing to see if there exists at least one group such that you would typically get a larger (lesser) value from it than the rest if you drew a value at random from each.

People assume this means that one median or mean must be greater than the other, but that isn't necessarily true. If the shapes and the variances of the distributions are identical (i.e., one group's distribution is just shifted up or down relative to the other), then stochastic dominance implies a greater mean and median (and also a greater third quartile, fifth percentile, etc.). However, if the shapes / variances of the distributions differ, then it isn't necessarily the case. For further discussion of these topics and to see an example where the means are switched, see my answer here: Wilcoxon-Mann-Whitney test giving surprising results. For an example where the medians are equal, but there is nonetheless a stochastically dominant group, consider this:

g1 = c(rep(0, 11), 1:10)                # group 1 has 11 0s, & then 1 to 10
g2 <- g3 <- g4<- c(-10:-1, rep(0, 11))  # the other groups have 11 0s, & -1 to -10
d  = stack(list(g1=g1, g2=g2, g3=g3, g4=g4))
aggregate(values~ind, d, median)        # the median of every group is 0
#   ind values
# 1  g1      0
# 2  g2      0
# 3  g3      0
# 4  g4      0

kruskal.test(values~ind, d)  # the KW test is highly significant nonetheless
#   Kruskal-Wallis rank sum test
# 
# data:  values by ind
# Kruskal-Wallis chi-squared = 28.724, df = 3, p-value = 2.559e-06
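
To make "stochastically dominant" concrete for the example above, one can estimate the probability that a random draw from g1 exceeds a random draw from g2, counting ties as one half. This is only a minimal sketch: the names diffs, p_sup, and W are mine, and the "probability of superiority" summary is one common way to express dominance, not something the KW test itself reports.

```r
# Same two groups as in the example above (g3 and g4 are copies of g2)
g1 <- c(rep(0, 11), 1:10)
g2 <- c(-10:-1, rep(0, 11))

# All pairwise differences between a value from g1 and a value from g2
diffs <- outer(g1, g2, "-")

# Estimated P(g1 > g2) + 0.5 * P(g1 == g2)
p_sup <- mean(diffs > 0) + 0.5 * mean(diffs == 0)
p_sup

# The Mann-Whitney statistic W counts the same pairs, so W / (n1 * n2)
# gives the same quantity (exact = FALSE avoids the ties warning)
W <- wilcox.test(g1, g2, exact = FALSE)$statistic
unname(W) / (length(g1) * length(g2))

# The medians are nonetheless identical
c(median(g1), median(g2))
```

A value of p_sup well above 0.5 with equal medians is exactly the situation in which reading a significant test as "the medians differ" would be misleading.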

With this understanding in mind, we can answer your specific questions.

1. If the distributions within each group (of chicks) / condition (feed type) have the same shape and variance, a significant KW test implies there is at least one group that is stochastically greater (lesser) than the others, and its mean (and median, and first quartile, and eighty-eighth percentile, etc.) is higher (lower) than the other groups. If the distributions differ in shape and/or variance, a significant KW test implies there is at least one group that is stochastically greater (lesser) than the others, but its mean (and median, and first quartile, and eighty-eighth percentile, etc.) is not necessarily higher (lower) than the other groups.
2. I would not bother running Levene's test before KW.
3. I would not bother running the Kolmogorov-Smirnov test before KW. Examining qq-plots seems reasonable.
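
For the chickwts data from the question, the checks discussed above might be sketched as follows. This is an illustration under my own assumptions, not a prescribed workflow: centring each group on its median before the QQ plot is my choice so that only shape and spread are compared, and the names summ, kw, and feed_pairs are hypothetical.

```r
data(chickwts)

# Centre and spread of each feed group
summ <- aggregate(weight ~ feed, data = chickwts,
                  FUN = function(w) c(median = median(w), IQR = IQR(w)))
summ

# The KW test itself
kw <- kruskal.test(weight ~ feed, data = chickwts)
kw

# Empirical QQ plot for every pair of feeds, after removing each group's median.
# Points close to the dashed line suggest similar shape and spread.
feed_pairs <- combn(levels(chickwts$feed), 2)   # 15 pairs for 6 feeds
op <- par(mfrow = c(3, 5), mar = c(4, 4, 1, 1))
for (j in seq_len(ncol(feed_pairs))) {
  x <- chickwts$weight[chickwts$feed == feed_pairs[1, j]]
  y <- chickwts$weight[chickwts$feed == feed_pairs[2, j]]
  qqplot(x - median(x), y - median(y),
         xlab = feed_pairs[1, j], ylab = feed_pairs[2, j])
  abline(0, 1, lty = 2)
}
par(op)
```

If the centred QQ plots look roughly linear along the dashed line, reading a significant result as a shift in medians is more defensible; if not, the stochastic-dominance reading is the safer one.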

Related Solutions

Kruskal-Wallis test on data with heterogeneous variance and small sample sizes per group

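Kruskal-Wallis is a non-parametric rank-based test. Under its null hypothesis, observations in one group do not tend to be larger than observations in any other group. This means that Kruskal-Wallis compares the groups through ranks rather than means; it can be read as a comparison of medians only when the group distributions have the same shape and spread.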

The issue with using Kruskal-Wallis on your data is that it contains 77% zeros and so there are a lot of ties. The p-value has to be corrected for all those ties.

Note: Variance is not an appropriate summary for your data because they consist mostly of zeros and the distribution of the densities is very skewed. Tests that are sensitive to non-normality are not appropriate, and symmetric confidence intervals as shown in your plot don't make much sense either.

kruskal.test(den ~ Year, data = data)
#> 
#>  Kruskal-Wallis rank sum test
#> 
#> data:  den by Year
#> Kruskal-Wallis chi-squared = 29.435, df = 10, p-value = 0.001059
# p-value adjusted for ties
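
The tie adjustment can be reproduced by hand to see what it does. The sketch below uses small made-up, zero-heavy data (not the poster's dataset); the names den, year, H_raw, tie_corr, and H_adj are hypothetical. It divides the unadjusted statistic by the usual tie-correction factor 1 - sum(t^3 - t) / (n^3 - n), where t runs over the sizes of the groups of tied values.

```r
# Made-up data with many zeros, three groups of 10 observations each
den  <- c(rep(0, 9), 0.2,
          rep(0, 7), 0.1, 0.4, 0.9,
          rep(0, 5), 0.2, 0.3, 0.5, 0.8, 1.1)
year <- factor(rep(c("y1", "y2", "y3"), each = 10))
toy  <- data.frame(den = den, year = year)

n  <- nrow(toy)
r  <- rank(toy$den)                  # tied values receive midranks
Ri <- tapply(r, toy$year, sum)       # rank sum per group
ni <- tapply(r, toy$year, length)    # group sizes

# Unadjusted Kruskal-Wallis statistic
H_raw <- 12 / (n * (n + 1)) * sum(Ri^2 / ni) - 3 * (n + 1)

# Tie-correction factor from the sizes of the tie groups
t_j      <- as.vector(table(toy$den))
tie_corr <- 1 - sum(t_j^3 - t_j) / (n^3 - n)
H_adj    <- H_raw / tie_corr

# kruskal.test() applies the same adjustment automatically
kw <- kruskal.test(den ~ year, data = toy)
c(by_hand = H_adj, kruskal.test = unname(kw$statistic))
```

Because the correction factor is below 1 whenever ties are present, the adjusted statistic is larger than the unadjusted one, and no extra step is needed when calling kruskal.test().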

However, why treat time (in years) as a categorical variable? You have observations from 11 consecutive years. A better plot of your data shows that the proportion of non-zero data points increases between 2016 and 2018 and then declines again.

This suggests treating time as continuous and modeling its effect with a smooth nonlinear function. Here is an analysis using proportional odds regression with restricted cubic splines as implemented in the rms package.

Note: Proportional odds regression generalizes the Kruskal-Wallis test [1].

library("rms")

anova(orm(den ~ rcs(Year, 4), data = data))
#>                 Wald Statistics          Response: den 
#> 
#>  Factor     Chi-Square d.f. P     
#>  Year       10.04      3    0.0182
#>   Nonlinear  9.26      2    0.0098
#>  TOTAL      10.04      3    0.0182

[1] Biostatistics for Biomedical Research course notes. Available online.

R code to make the small multiples plot above.

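library("dplyr")    # mutate() and the %>% pipe
library("ggplot2")  # ggplot(), geom_bar(), facet_wrap()

# One bar chart of the rounded densities per year
data %>%
  mutate(
    den = round(den, 3)
  ) %>%
  ggplot(
    aes(den)
  ) +
  geom_bar(
    width = 0.001
  ) +
  facet_wrap(
    ~Year,
    ncol = 4
  )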
