Solved – Wilcoxon test for sample n=3

Tags: r, sample-size, wilcoxon-mann-whitney-test, wilcoxon-signed-rank

I have a dataset of male and female bats. For the male bats I have 3 evening and 3 morning trips; for the female bats, 5 evening and 3 morning trips.
Each trip is described by parameters such as trip duration, distance covered, farthest point from the roost, speed, home range and flying height.
I wanted to perform a Wilcoxon test separately for male and female bats to see whether the differences between evening and morning trips arose by chance. I was surprised to hear that the Wilcoxon test only works with a minimum of n=6. That would mean I cannot do the test for any of my bats. Is that really correct? If not, I would be very grateful if you could point me to a reference, since I am writing my bachelor's thesis on this topic.

Also, when comparing evening trips between male and female bats using the Mann-Whitney U test, the sample sizes are required to be in a ratio of at least 4:2. So I cannot perform that test for the male bats either?

Each individual bat makes a trip in the evening and in the morning. This is called paired data; please correct me if I am wrong. I want to compare, e.g., the speed of the evening trip with the speed of the same bat's morning trip. I have three individuals.

Best Answer

Well, I can answer part of the question:

The Wilcoxon rank sum test (Mann Whitney U) works for a comparison of $n_1=3$ vs $n_2=3$ just fine.

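However, for a two-tailed test you can't reasonably set your significance level below 10%, since that's the smallest achievable p-value: with no ties there are only $\binom{6}{3}=20$ equally likely ways to split the six ranks between the two groups, so the most extreme outcome has a two-tailed probability of $2/20$.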

Here's an example done in R:

> x
[1] 0.21 1.70 2.55
> y
[1] 2.58 4.25 3.21
> wilcox.test(x,y)

        Wilcoxon rank sum test

data:  x and y
W = 0, p-value = 0.1
alternative hypothesis: true location shift is not equal to 0
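The 10% floor can be checked by brute force. What follows is only a minimal sketch, assuming no ties and reusing the x and y values from the example above; the names rank_sets, W_null, p_min and p_r are my own and not part of any package.

```r
# Sample sizes from the example above
n1 <- 3
n2 <- 3

# Under the null hypothesis every choice of n1 ranks out of n1 + n2
# is equally likely for the first sample (assuming no ties).
rank_sets <- combn(n1 + n2, n1)                   # one column per choice
W_null <- colSums(rank_sets) - n1 * (n1 + 1) / 2  # W as R defines it

# Two-tailed p-value of the most extreme outcome (smallest possible W)
p_min <- 2 * mean(W_null <= min(W_null))

# Compare with wilcox.test on the data from the example
x <- c(0.21, 1.70, 2.55)
y <- c(2.58, 4.25, 3.21)
p_r <- wilcox.test(x, y)$p.value

print(c(arrangements = ncol(rank_sets), p_min = p_min, p_r = p_r))
```

Because the data in the example are completely separated (every x is below every y), wilcox.test reports exactly this smallest achievable value.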

A Wilcoxon signed rank test of 3 pairs also works just fine, but the significance level issue is worse; now your lowest possible two-tailed significance level is 25%, because only 2 of the $2^3=8$ equally likely sign patterns are as extreme as possible. Here's an example:

> wilcox.test(y-x)

        Wilcoxon signed rank test

data:  y - x
V = 6, p-value = 0.25
alternative hypothesis: true location is not equal to 0
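The 25% floor for three pairs can be reproduced the same way. This is a rough sketch, assuming no zero differences and no tied absolute differences; signs, V_null, floor_p and smallest_n are names I made up for illustration. It also suggests (my guess, not something stated in the question) where an n = 6 rule might come from: six pairs is the first point at which a two-tailed p-value below 5% becomes possible at all.

```r
n <- 3  # number of pairs

# Under the null hypothesis each of the 2^n sign patterns is equally likely.
signs <- as.matrix(expand.grid(rep(list(c(0, 1)), n)))  # 1 = positive difference
V_null <- as.vector(signs %*% seq_len(n))               # sum of positive ranks

# Two-tailed p-value of the most extreme outcome (all differences positive)
p_min <- 2 * mean(V_null >= max(V_null))

# Smallest achievable two-tailed p-value for a range of pair counts
n_pairs <- 3:8
floor_p <- 2 / 2^n_pairs
smallest_n <- min(n_pairs[floor_p <= 0.05])

print(data.frame(n_pairs, floor_p))
```

So the test itself is defined for any number of pairs; what changes with n is only how small the p-value can get.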

So the claim that one or the other test doesn't "work" at those sample sizes isn't true -- but if you want a smaller significance level, that would be a problem for you.

[Whether what you're trying to do/have been advised to do makes sense is less clear from your discussion. More details would help.]

Related Solutions

Solved – Wilcoxon rank sum test in R

The Note in the help on the wilcox.test function clearly explains why R's value is smaller than yours:

Note

The literature is not unanimous about the definitions of the Wilcoxon rank sum and Mann-Whitney tests. The two most common definitions correspond to the sum of the ranks of the first sample with the minimum value subtracted or not: R subtracts and S-PLUS does not, giving a value which is larger by m(m+1)/2 for a first sample of size m. (It seems Wilcoxon's original paper used the unadjusted sum of the ranks but subsequent tables subtracted the minimum.)

That is, the definition R uses is $n_1(n_1+1)/2$ smaller than the version you use, where $n_1$ is the number of observations in the first sample.
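The relationship between the two definitions is easy to confirm on a small data set. A minimal sketch, assuming made-up illustrative values for x and y (not the data from the question); rank_sum_x and W_r are names chosen here for illustration.

```r
# Illustrative data only, not the data from the question
x <- c(0.21, 1.70, 2.55)
y <- c(2.58, 4.25, 3.21)
n1 <- length(x)

# Unadjusted rank sum of the first sample (the convention that does not subtract)
rank_sum_x <- sum(rank(c(x, y))[seq_len(n1)])

# Statistic as reported by R, which subtracts the minimum n1 * (n1 + 1) / 2
W_r <- unname(wilcox.test(x, y)$statistic)

print(c(rank_sum = rank_sum_x, W = W_r, offset = n1 * (n1 + 1) / 2))
```

The two numbers differ by exactly $n_1(n_1+1)/2$, and the p-value is the same under either convention since the offset is a constant.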

As for modifying the result, you could assign the output from wilcox.test to a variable, say a, and then manipulate a$statistic by adding the minimum to its value and changing its name. Then when you print a (e.g. by typing a), it will look the way you want.

To see what I am getting at, try this:

a <- wilcox.test(x,y,correct=FALSE)
str(a)

So for example if you do this:

n1 <- length(x)
a$statistic <- a$statistic + n1*(n1+1)/2
names(a$statistic) <- "T.W"
a

then you get:

        Wilcoxon rank sum test with continuity correction

data:  x and y 
T.W = 156.5, p-value = 0.006768
alternative hypothesis: true location shift is not equal to 0

It's quite common to refer to the rank sum test statistic (whether shifted by $n_1(n_1+1)/2$ or not) as either $W$ or $w$ or some close variant. It also often gets called '$U$' because of Mann & Whitney. There's plenty of precedent for using $W$, so for myself I wouldn't bother with the line that changes the name of the statistic, but if it suits you to do so there's no reason why you shouldn't, either.

Solved – Wilcoxon rank test where sample sizes are very different

The short answer is no, as long as a Wilcoxon rank test is appropriate in the first place.

The long answer is that very different group sizes affect the power: you will typically have a lot more power with 550 vs. 550 observations than with 100 vs. 1000. If the analysis happens to be underpowered for the kind of effect sizes one might reasonably expect, then any apparent findings are more likely to be false positives, and a failure to reject the null hypothesis of no difference between groups is more likely to be a false negative. Additionally, if the numbers were much smaller, then I might start to worry about the discreteness of the distribution of the test statistic etc., but with numbers like 100 vs. 1000 that is really not a concern (unless a lot of observations have the exact same outcome).
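The power difference can be illustrated with a small simulation. This is only a sketch under assumptions of my own: normally distributed data, a location shift of 0.2 standard deviations, 200 replications and a 5% level. The helper power_sim is hypothetical, written for this example.

```r
set.seed(1)  # for reproducibility of the simulation

# Estimate the rejection rate of the rank sum test for given group sizes.
# shift, reps and alpha are illustrative assumptions, not values from the question.
power_sim <- function(n1, n2, shift = 0.2, reps = 200, alpha = 0.05) {
  rejections <- replicate(reps, {
    x <- rnorm(n1)
    y <- rnorm(n2, mean = shift)
    wilcox.test(x, y)$p.value < alpha
  })
  mean(rejections)
}

# Same total sample size (1100), split evenly or unevenly
power_balanced   <- power_sim(550, 550)
power_unbalanced <- power_sim(100, 1000)

print(c(balanced = power_balanced, unbalanced = power_unbalanced))
```

With these settings the balanced design should reject noticeably more often, even though both designs use the same total number of observations.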
