R – How to Tune `wilcox.test` to Compare Means Instead of Medians?

Tags: distributions, inference, r, wilcoxon-mann-whitney-test

My question is: how does one tune wilcox.test in R to compare means instead of medians?

Background

According to this site by Laerd Statistics one can use a Wilcoxon Rank-Sum / Mann-Whitney U test for determining if there is a statistically significant difference in the center of two continuous distributions. Specifically, the default behavior of the test is to compare the medians. However, as described by that site in assumption #4, if the two distributions are not 'similarly-shaped', then the test should instead be tuned so as to compare the means instead.

I am on board with all of that.

Now I want to know how to implement these two versions of the test in R.

From my reading of the wilcox.test documentation, the default behavior is to compare the medians. However, I do not see any parameter that can be used to alter the behavior so that the means are compared instead. I wondered whether the parameter mu could achieve this, but I think that's a dead end.

Example code (if it is helpful)


# Build dataset
bwBySmoke <- data.frame(
    `Birth weight` = c(
        2.18, 2.74, 2.9, 2.27, 2.65, 2.42, 2.23, 2.86, 3.6, 3.65, 3.69, 3.53, 2.38, 2.34, 3.99, 3.89, 3.6, 3.73, 3.31, 3.7, 4.08, 3.61, 3.83, 3.41, 4.13, 3.36, 3.54, 3.51, 2.71
        )
    , `Smoking habit` = c(
        'Heavy smokers', 'Heavy smokers', 'Heavy smokers', 'Heavy smokers', 'Heavy smokers', 'Heavy smokers', 'Heavy smokers', 'Heavy smokers', 'Heavy smokers', 'Heavy smokers', 'Heavy smokers', 'Heavy smokers', 'Heavy smokers', 'Heavy smokers', 'Non-smokers', 'Non-smokers', 'Non-smokers', 'Non-smokers', 'Non-smokers', 'Non-smokers', 'Non-smokers', 'Non-smokers', 'Non-smokers', 'Non-smokers', 'Non-smokers', 'Non-smokers', 'Non-smokers', 'Non-smokers', 'Non-smokers'
        )
    )

# View distributions
library(tidyverse)
library(ggridges)
bwBySmoke %>% 
  ggplot(aes(y = Smoking.habit, x = Birth.weight)) +
  geom_density_ridges(stat = 'binline', bins = 20)

# Do Wilcoxon Rank-Sum / Mann-Whitney U test with default parameters to compare medians
with(bwBySmoke,
    wilcox.test(Birth.weight~Smoking.habit
        ))

# Do Wilcoxon Rank-Sum / Mann-Whitney U test to compare means instead, since the distributions are not similarly shaped.
# The "mystery parameter" is commented out because I don't think it exists.
with(bwBySmoke,
    wilcox.test(Birth.weight~Smoking.habit
        #, mystery_parameter = 'mean'
        ))

If wilcox.test cannot be tuned, is there an alternative test to use?

This was originally posted to Stack Overflow, but I was kindly redirected here and I deleted the other post. The person who redirected me indicated that wilcox.test will not work for me. I am okay with that, but then is there another function (perhaps from a different library?) that will perform a Wilcoxon Rank-Sum / Mann-Whitney U test to compare mean ranks instead of median ranks?

Thanks in advance for any help!

Best Answer

The website you linked does not say anything about tuning the test, or about the test being used to compare means. The Mann-Whitney test always tests the difference between mean ranks, as the website states, but mean ranks are not the same as means (ranks are the positions first, second, third, and so on in the pooled, sorted data). The Mann-Whitney test never compares median ranks.
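To make the distinction between mean ranks and means concrete, here is a minimal sketch using the birth-weight data from the question. The column names `weight` and `habit` and the helper variables are my own choices for illustration. It ranks the pooled observations, compares the per-group mean ranks with the per-group raw means, and checks that the `W` statistic reported by `wilcox.test` can be rebuilt from the rank sum of the first group alone.

```r
# Birth weights from the question, rebuilt with shorter (illustrative) column names
bw <- data.frame(
  weight = c(2.18, 2.74, 2.9, 2.27, 2.65, 2.42, 2.23, 2.86, 3.6, 3.65,
             3.69, 3.53, 2.38, 2.34, 3.99, 3.89, 3.6, 3.73, 3.31, 3.7,
             4.08, 3.61, 3.83, 3.41, 4.13, 3.36, 3.54, 3.51, 2.71),
  habit = rep(c("Heavy smokers", "Non-smokers"), times = c(14, 15))
)

# Rank all observations together; tied values receive their average rank
bw$rank <- rank(bw$weight)

# Mean rank per group (what the test works with) vs. raw mean per group
mean_ranks <- tapply(bw$rank, bw$habit, mean)
raw_means  <- tapply(bw$weight, bw$habit, mean)

# exact = FALSE requests the normal approximation, because the data contain a tie
res <- wilcox.test(weight ~ habit, data = bw, exact = FALSE)

# Rebuild W by hand: rank sum of the first group minus n1 * (n1 + 1) / 2
is_first <- bw$habit == "Heavy smokers"
n1 <- sum(is_first)
w_manual <- sum(bw$rank[is_first]) - n1 * (n1 + 1) / 2

print(mean_ranks)
print(raw_means)
print(c(W_reported = unname(res$statistic), W_manual = w_manual))
```

The two `W` values should agree, which shows that the test only ever sees the ranks, not the raw birth weights.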

You don't need to "tune the test"; the website is only talking about how to interpret the results. If the two distributions have the same shape, then testing the difference between mean ranks is equivalent to testing the difference between medians, so you can interpret the result as a test of a difference between medians. Otherwise, you cannot. If you want to test the difference in means, you can use a t-test, which has different assumptions.
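For the comparison of means, a minimal sketch with base R's `t.test` might look like the following. It assumes the same birth-weight data as in the question, with column names `weight` and `habit` chosen here for illustration. By default `t.test` runs the Welch version, which does not assume equal variances; whether its other assumptions hold for these data is something you would need to check yourself.

```r
# Birth weights from the question, rebuilt with shorter (illustrative) column names
bw <- data.frame(
  weight = c(2.18, 2.74, 2.9, 2.27, 2.65, 2.42, 2.23, 2.86, 3.6, 3.65,
             3.69, 3.53, 2.38, 2.34, 3.99, 3.89, 3.6, 3.73, 3.31, 3.7,
             4.08, 3.61, 3.83, 3.41, 4.13, 3.36, 3.54, 3.51, 2.71),
  habit = rep(c("Heavy smokers", "Non-smokers"), times = c(14, 15))
)

# Welch two-sample t-test (var.equal = FALSE is the default): compares group means
tt <- t.test(weight ~ habit, data = bw)

# The estimates reported by the test are the two raw group means
group_means <- tapply(bw$weight, bw$habit, mean)

print(tt)
print(group_means)
```

Passing `var.equal = TRUE` would instead give the classical pooled-variance t-test.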