Solved – Understanding the results of Bartlett’s test of homoscedasticity in ANOVA

anova, hypothesis testing, r, variance

I want to conduct one-way ANOVA for this data:

# three factor levels
I <- c(19, 22, 20, 18, 25, 21, 24, 17)
II <- c(20, 21, 33, 27, 29, 30, 22, 23)
III <- c(16, 15, 18, 26, 17, 23, 20, 19)

# making a dataframe from data
response <- c(I, II, III)
factor <- c(rep("I", length(I)), rep("II", length(II)), rep("III", length(III)))
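# stringsAsFactors = TRUE keeps the grouping column a factor in every R version
(data1 <- data.frame(response, factor, stringsAsFactors = TRUE))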

So firstly, I check the boxplot for every factor level:

# making side-by-side boxplots
plot(response ~ factor, data1)

and see that the variance for level II is much higher than for I and III, so I suspect that Bartlett's test will reject the null hypothesis of equal variances.

I also check the exact values of these variances and see that the second one (22.84) is significantly different from the others:

tapply(data1$response, data1$factor, var)
#      I        II       III 
#  7.928571 22.839286 13.642857 

Then I check the normality of the response; it looks fine:

# testing for normality
qqnorm(data1$response)
qqline(data1$response)

# Shapiro-Wilk test on the pooled response, at the 0.01 significance level
if (shapiro.test(data1$response)$p.value >= 0.01) {
  cat("No reason to reject null hypothesis")
} else {
  cat("This distribution isn't normal")
}
# No reason to reject null hypothesis

So I finally go to Bartlett's test:

# testing for homoscedasticity
bartlett.test(response ~ factor, data1)

# Bartlett test of homogeneity of variances

# data:  response by factor
# Bartlett's K-squared = 1.7932, df = 2, p-value = 0.408

And I see that there's no reason to reject the null hypothesis. I know, of course, that this is not the same as "the null hypothesis is true", but there is a significant difference in the variances here and the test still passes. Why? And should I assume homogeneity of variances and go on with ANOVA?
Thanks for your time 🙂

Best Answer

Statistical tests are what let you say whether differences are significant. Deciding that the variances are significantly different just by looking at their computed values goes against the whole point of statistical testing.

In your case, each variance is estimated from only eight points per group. The uncertainty about the true variance of each group is therefore high, which is why Bartlett's test does not reject the null hypothesis.
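
To get a feel for how uncertain a variance estimated from eight points is, here is a minimal sketch of the usual chi-square confidence interval for a variance, applied to the three sample variances from the question. It assumes the data in each group are normally distributed; the names `vars`, `n`, `df` and `ci` are mine, chosen only for illustration.

```r
# Sample variances reported in the question (one per factor level)
vars <- c(I = 7.928571, II = 22.839286, III = 13.642857)
n  <- 8        # observations per group
df <- n - 1    # degrees of freedom of each variance estimate

# 95% chi-square confidence interval for each variance
# (assumption: normally distributed data within each group)
ci <- cbind(
  lower    = df * vars / qchisq(0.975, df),
  estimate = vars,
  upper    = df * vars / qchisq(0.025, df)
)
print(round(ci, 2))
```

The three intervals are wide and overlap one another, which is consistent with Bartlett's test finding no evidence against equal variances.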

If you had 800 points in each group and obtained the same variance values, Bartlett's test would reject the null hypothesis decisively.
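
The effect of sample size can be checked directly, because Bartlett's statistic depends only on the group variances and the group sizes. The sketch below is a hand-rolled version of the textbook formula, not the internals of `bartlett.test`; the helper name `bartlett_from_vars` and the variables `small` and `large` are hypothetical, introduced only for this illustration.

```r
# Bartlett's K-squared computed from summary statistics only.
# vars: sample variances per group; n: sample sizes per group.
bartlett_from_vars <- function(vars, n) {
  k        <- length(vars)
  df_i     <- n - 1                          # per-group degrees of freedom
  df_total <- sum(df_i)                      # N - k
  pooled   <- sum(df_i * vars) / df_total    # pooled variance
  num <- df_total * log(pooled) - sum(df_i * log(vars))
  den <- 1 + (sum(1 / df_i) - 1 / df_total) / (3 * (k - 1))
  stat <- num / den
  c(K2 = stat, df = k - 1,
    p.value = pchisq(stat, df = k - 1, lower.tail = FALSE))
}

# Sample variances from the question
vars <- c(7.928571, 22.839286, 13.642857)

small <- bartlett_from_vars(vars, n = rep(8, 3))    # the actual design
large <- bartlett_from_vars(vars, n = rep(800, 3))  # same variances, 800 per group

print(small)
print(large)
```

With 8 points per group this reproduces the K-squared of about 1.79 and the p-value of about 0.41 reported in the question. With 800 points per group and the same sample variances, the statistic is more than a hundred times larger and the p-value is effectively zero.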
