In the Wikipedia article on ANOVA, it says
In its simplest form, ANOVA provides a statistical test of whether or not the means of several groups are equal, and therefore generalizes the t-test to more than two groups.
My understanding of this is that ANOVA is the same as the t-test when it comes to a two-group comparison.
However, in my simple example below (in R), ANOVA and t-test give similar but slightly different p-values. Can anyone explain why?
x1=rnorm(100,mean=0,sd=1)
x2=rnorm(100,mean=0.5,sd=1)
y1=rnorm(100,mean=0,sd=10)
y2=rnorm(100,mean=0.5,sd=10)
t.test(x1,x2)$p.value # 0.0002695961
t.test(y1,y2)$p.value # 0.8190363
df1=as.data.frame(rbind(cbind(x=x1,type=1), cbind(x=x2,type=2)))
df2=as.data.frame(rbind(cbind(x=y1,type=1), cbind(x=y2,type=2)))
anova(lm(x~type,df1))$`Pr(>F)`[1] # 0.0002695578
anova(lm(x~type,df2))$`Pr(>F)`[1] # 0.8190279
Best Answer
By default the argument var.equal of t.test() equals FALSE. In lm(), the residuals are assumed to have constant variance. Thus, by setting var.equal = TRUE in t.test(), you should get the same result. var.equal indicates whether to treat the two variances as being equal: if TRUE, the pooled variance is used to estimate the variance; otherwise the Welch (or Satterthwaite) approximation to the degrees of freedom is used.
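A quick check of this claim (a sketch regenerating data as in the question; the seed is arbitrary and only added for reproducibility): with two groups, the one-way ANOVA F-test is the square of the pooled-variance t-test, so the two p-values agree exactly once var.equal = TRUE is set.

```r
set.seed(1)  # arbitrary seed, just for reproducibility
x1 <- rnorm(100, mean = 0,   sd = 1)
x2 <- rnorm(100, mean = 0.5, sd = 1)

# Pooled-variance t-test: assumes equal group variances, like lm()/ANOVA
p_pooled <- t.test(x1, x2, var.equal = TRUE)$p.value

# One-way ANOVA on the same data
df1 <- data.frame(x = c(x1, x2), type = factor(rep(1:2, each = 100)))
p_anova <- anova(lm(x ~ type, df1))$`Pr(>F)`[1]

all.equal(p_pooled, p_anova)  # TRUE: identical up to numerical precision
```

The default Welch test (var.equal = FALSE) instead adjusts the degrees of freedom for unequal variances, which is why the question's p-values were close but not identical.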