Solved – Why would I use ANOVA instead of a Rank-Sum test

anova, nonparametric, t-test

A colleague of mine with little statistics experience is trying to perform an experimental evaluation of a computer program. He created a between-subjects design and solicited test subjects.

11 people were given his "new and improved" computer program to use. 10 others got the "old and boring" computer program to use, for the same task.

He asked me and several other people around the lab how to analyze his data.

I told him he should examine the data for normality. If it was normally distributed, he should use a t-test. If it was not, he should use a Wilcoxon rank sum test.
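In R, the rule I suggested amounts to roughly the following sketch (the score vectors here are made-up placeholders, not his actual data):

#hypothetical scores for the two groups, just to illustrate the decision rule
new_scores <- c(17, 15, 19, 18, 16, 20, 14, 18, 17, 19, 16)  #11 subjects
old_scores <- c(12, 14, 11, 15, 13, 10, 14, 12, 13, 11)      #10 subjects

#check each group for approximate normality, then pick the test
if (shapiro.test(new_scores)$p.value > 0.05 &&
    shapiro.test(old_scores)$p.value > 0.05) {
  t.test(new_scores, old_scores)        #looks roughly normal: t-test
} else {
  wilcox.test(new_scores, old_scores)   #otherwise: Wilcoxon rank-sum test
}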

One of my colleagues told him he should use ANOVA, even though he only has two groups. Apparently, running ANOVA on non-normal data in R produces some new degrees-of-freedom measure that can be plugged into a t-test.

I've never heard of such a thing. Is this true? Is it statistically valid? Why would anyone use it instead of just doing a rank-sum test?

Best Answer

Using ANOVA in R does not produce anything different from using ANOVA in another program, and with two groups the results will be equivalent to an equal-variance t-test. The t-test is known to be robust to deviations from normality, though with unequal variances Welch's t-test is probably preferable.
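To see that equivalence concretely, here is a tiny illustration with made-up scores (nothing to do with the actual data): the F statistic from a one-way ANOVA on two groups is the square of the pooled-variance t statistic, and the p-values agree.

#two groups of made-up scores, 10 and 11 subjects
set.seed(1)
y <- c(rnorm(10, mean=14, sd=3), rnorm(11, mean=17, sd=3))
grp <- factor(rep(1:2, c(10, 11)))

tt <- t.test(y ~ grp, var.equal=TRUE)   #equal-variance t-test
av <- anova(lm(y ~ grp))                #one-way ANOVA

tt$statistic^2                          #t^2 ...
av[1, "F value"]                        #... equals the ANOVA F statistic
c(tt$p.value, av[1, "Pr(>F)"])          #identical p-values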

In the special case of a score based on the number of correct answers on a multiple-choice test, the distribution of the score is probably an overdispersed binomial. In that case the "correct" analysis might be a GLM with a quasi-binomial family. Of course, the results might be quite similar to those from the unequal-variance t-test.

Here is a simple simulation-based example with 20 questions and unequal variances. Welch's t-test gives a result much closer to that of the overdispersed binomial regression.

set.seed(3413)
#first sample: per-subject success probabilities vary via a normal random
#effect on the logit scale (this creates the overdispersion), then scores out of 20
p1 <- 1/(1 + exp(-1 + rnorm(10, sd=1)))
x1 <- rbinom(10, size=20, prob=p1)
#second sample: same construction, higher on the logit scale
p2 <- 1/(1 + exp(-3 + rnorm(10, sd=1)))
x2 <- rbinom(10, size=20, prob=p2)
#combine the two samples and create the grouping factor
x <- c(x1, x2)
g <- gl(2, 10)

#summaries:
tapply(x, g, mean)
   1    2 
12.6 19.2 
tapply(x, g, sd)
       1        2 
3.921451 1.032796 

#t-test:
t.test(x ~ g, var.equal=TRUE)

        Two Sample t-test

data:  x by g 
t = -5.1468, df = 18, p-value = 6.765e-05
alternative hypothesis: true difference in means is not equal to 0 
95 percent confidence interval:
 -9.294136 -3.905864 
sample estimates:
mean in group 1 mean in group 2 
           12.6            19.2 


#without equal variances:
t.test(x ~ g, var.equal=FALSE)

        Welch Two Sample t-test

data:  x by g 
t = -5.1468, df = 10.243, p-value = 0.0004016
alternative hypothesis: true difference in means is not equal to 0 
95 percent confidence interval:
 -9.448128 -3.751872 
sample estimates:
mean in group 1 mean in group 2 
           12.6            19.2 


#overdispersed binomial regression:
summary(glm(cbind(x, 20-x) ~ g, family="quasibinomial") )

Call:
glm(formula = cbind(x, 20 - x) ~ g, family = "quasibinomial")

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-2.5340  -0.8386  -0.2199   1.2778   2.7581  

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   0.5322     0.2242   2.374 0.028946 *  
g2            2.6458     0.5962   4.438 0.000318 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

(Dispersion parameter for quasibinomial family taken to be 2.343713)

    Null deviance: 120.242  on 19  degrees of freedom
Residual deviance:  45.197  on 18  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 5