I have performed a Kruskal-Wallis test to assess the difference in disease severity (recorded on a scale of 0-10) among different months. Here's the code I used:
kruskal.test(mean_severity ~ month, data = dat)
Kruskal-Wallis rank sum test
data: mean_severity by month
Kruskal-Wallis chi-squared = 20.172, df = 7, p-value = 0.00521
The obtained p-value indicates a significant difference in disease severity among the months. To further analyze and visualize the differences, I used the ggstatsplot package with the following code:
library(ggstatsplot)
ggbetweenstats(data = dat, x = month, y = mean_severity,
               type = "nonparametric", p.adjust.method = "fdr")
My question is: How can I interpret and report the results from the analysis/plot?
Best Answer
Let's create a reproducible example. I simulate a dataset that has the same structure as yours.
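A minimal sketch of such a simulation, assuming eight months (so that df = 7, as in your output) with 30 observations each. The group sizes, the gamma distribution, the shape values, and the seed are all illustrative assumptions, not the values behind the figure discussed here.

```r
# Hypothetical stand-in data: 8 months x 30 observations, severity on a 0-10 scale.
set.seed(123)
months <- month.name[1:8]                     # "January" ... "August"
shape_by_month <- c(1, 1, 1.2, 1, 1.5, 3, 1, 4)  # assumed; larger shape = higher severity

dat <- data.frame(
  month = factor(rep(months, each = 30), levels = months),
  # Right-skewed values from a gamma distribution, capped at the scale maximum of 10
  mean_severity = pmin(10, rgamma(240, shape = rep(shape_by_month, each = 30)))
)

str(dat)
table(dat$month)
```

Keeping month as a factor with explicit levels preserves calendar order on the x axis instead of alphabetical order.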
We can run the Kruskal-Wallis test:
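A minimal sketch of that call on hypothetical simulated data (eight months, 30 observations each; the distribution and seed are assumptions made for illustration). The test statistic and p-value will differ from the ones in your output.

```r
# Hypothetical stand-in data: 8 months x 30 observations, severity capped at 10.
set.seed(123)
months <- month.name[1:8]
dat <- data.frame(
  month = factor(rep(months, each = 30), levels = months),
  mean_severity = pmin(10, rgamma(240, shape = rep(c(1, 1, 1.2, 1, 1.5, 3, 1, 4), each = 30)))
)

# Omnibus test: does at least one month differ in its severity distribution?
res <- kruskal.test(mean_severity ~ month, data = dat)
res

# Degrees of freedom = number of groups - 1
res$parameter
```

A significant result only says that at least one month differs; it does not say which ones, which is why a post hoc test is needed.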
And use ggbetweenstats to plot the results of a post hoc multiple comparison test:
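A sketch of the plotting call, again on hypothetical simulated data (the data-generating lines are assumptions for illustration). The arguments shown are the same ones used in your call; here p.adjust.method is set to "holm" explicitly, and you can swap in "fdr" to match your analysis.

```r
library(ggstatsplot)

# Hypothetical stand-in data: 8 months x 30 observations, severity capped at 10.
set.seed(123)
months <- month.name[1:8]
dat <- data.frame(
  month = factor(rep(months, each = 30), levels = months),
  mean_severity = pmin(10, rgamma(240, shape = rep(c(1, 1, 1.2, 1, 1.5, 3, 1, 4), each = 30)))
)

# Box/violin plot with the Kruskal-Wallis result in the subtitle and
# pairwise Dunn comparisons drawn above the groups
p <- ggbetweenstats(
  data = dat,
  x = month,
  y = mean_severity,
  type = "nonparametric",
  p.adjust.method = "holm"
)
p
```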
At the top of the plot, we can see the p-values of Dunn's test for the pairs of groups that are statistically different. For some reason, these are not shown in the image in the question. A line connects each pair of groups that differ significantly. In this example, April differs from August and from June, but not from July.
This is simply a visualization of Dunn's test. To run this test, ggstatsplot uses the function kwAllPairsDunnTest from the package PMCMRplus, with a "Holm" correction for multiple comparisons by default (your call overrides this with p.adjust.method = "fdr"). The table reports the p-values that are represented in the figure. April and August, for example, are different (p-value = 1.3e-05), but April and July are not (p-value = 1.0000).
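To get the table of pairwise p-values directly, you can call the PMCMRplus function yourself. This is a sketch on hypothetical simulated data (the data-generating lines are assumptions), so the p-values it prints will not match the ones quoted above.

```r
library(PMCMRplus)

# Hypothetical stand-in data: 8 months x 30 observations, severity capped at 10.
set.seed(123)
months <- month.name[1:8]
dat <- data.frame(
  month = factor(rep(months, each = 30), levels = months),
  mean_severity = pmin(10, rgamma(240, shape = rep(c(1, 1, 1.2, 1, 1.5, 3, 1, 4), each = 30)))
)

# Dunn's all-pairs test with Holm-adjusted p-values;
# use p.adjust.method = "fdr" to match the call in the question
dunn <- kwAllPairsDunnTest(mean_severity ~ month, data = dat,
                           p.adjust.method = "holm")
summary(dunn)

# Lower-triangular matrix of adjusted p-values, one cell per pair of months
round(dunn$p.value, 4)
```

When reporting, state the omnibus result first (Kruskal-Wallis chi-squared, df, p-value), then name the post hoc test and the correction method, and list the pairs whose adjusted p-values fall below your threshold.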
Roughly speaking, when the boxes of two groups in the box plot do not overlap, you can hypothesize a statistically significant difference between them. However, this is only a heuristic, and a statistical test is required to check differences reliably and rigorously. Moreover, with skewed data, as in the figure in the question where most of the boxes are squeezed at the bottom, it can be hard to judge overlaps by eye. This is why it is useful to check the pairwise p-values at the top of the figure provided by ggstatsplot, and/or inspect the output table of Dunn's test.