Variance Homogeneity – How to Test Homogeneity of Variance for Two Groups with Different Sample Sizes

heteroscedasticity r variance

I have two groups of data with different sample sizes, and in order to analyze both sets they must have the same variance. I was told I should use Bartlett's test to check the homogeneity of variance, but when I try to run the test in R it says that the two groups must have the same sample size.

Does Bartlett's test require the groups to have the same sample size?
How was my labmate able to analyze a similar dataset (two groups, different sample sizes) using Bartlett's?
What other test could I use that would show the two groups have similar variances?

Best Answer

I don't know what code you used, but these tests do not require equal sample sizes. You can use Levene's test to check for heteroscedasticity. In R, you can use ?leveneTest in the car package:

set.seed(9719)                       # this makes the example exactly reproducible
g1 = rnorm( 50, mean=2, sd=2)        # here I generate data w/ different variances
g2 = rnorm(100, mean=3, sd=3)        #   & different sample sizes
my.data = stack(list(g1=g1, g2=g2))  # getting the data into 'stacked' format

library(car)                         # this package houses the function
leveneTest(values~ind, my.data)      # here I test for heteroscedasticity:
# Levene's Test for Homogeneity of Variance (center = median)
#        Df F value   Pr(>F)   
# group   1  8.4889 0.004128 **
#       148                    
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
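
Bartlett's test can be run on the same unequal-sized groups. The following is a minimal sketch, with the simulated data regenerated so the block runs on its own; bartlett.test() comes from base R's stats package. The last call is only a guess at what may have triggered the error in the question (the original code was not shown): when two plain vectors are passed, the second is treated as a grouping variable and must be as long as the first.

```r
set.seed(9719)
g1 = rnorm( 50, mean=2, sd=2)        # 50 observations
g2 = rnorm(100, mean=3, sd=3)        # 100 observations
my.data = stack(list(g1=g1, g2=g2))  # 'stacked' format: columns values, ind

bt.formula = bartlett.test(values ~ ind, data=my.data)  # formula interface
bt.list    = bartlett.test(list(g1, g2))                # list of samples, same test
bt.formula

# ASSUMPTION: a possible source of the "same size" complaint.
# Here g2 is interpreted as the grouping vector for g1, so the lengths must match.
err = tryCatch(bartlett.test(g1, g2), error = function(e) e)
inherits(err, "error")
```

Keep in mind that Bartlett's test is sensitive to departures from normality, which is one reason Levene's test is often preferred.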

Levene's test is just a $t$-test ($F$-test) on transformed data. (I discuss tests for heteroscedasticity here: Why Levene test of equality of variances rather than F ratio?) What having unequal sample sizes will do is cause you to have less power to detect a difference. To understand this more fully, it may help to read my answer here: How should one interpret the comparison of means from different sample sizes? Note however, that running a test of your assumptions and then choosing a primary test is not generally recommended (see, e.g., here: A principled method for choosing between t-test or non-parametric e.g. Wilcoxon in small samples). If you are worried that there may be heteroscedasticity, you might do best to simply use a test that won't be susceptible to it, such as the Welch $t$-test, or even the Mann-Whitney $U$-test (which doesn't even require normality). Some information about alternative strategies can be gathered from my answer here: Alternatives to one-way ANOVA for heteroskedastic data.
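
To make the two points above concrete, here is a rough sketch using only base R. It first builds the median-centered Levene's test by hand (absolute deviations from each group's median, followed by a one-way ANOVA), which should agree with the leveneTest() output shown earlier, and then runs the Welch and Mann-Whitney tests. The object names (abs.dev, levene.by.hand, welch, mwu) are made up for this illustration.

```r
set.seed(9719)
g1 = rnorm( 50, mean=2, sd=2)
g2 = rnorm(100, mean=3, sd=3)
my.data = stack(list(g1=g1, g2=g2))

# Levene's test by hand (center = median):
# transform to absolute deviations from the group median, then run an F-test
my.data$abs.dev = abs(my.data$values - ave(my.data$values, my.data$ind, FUN=median))
levene.by.hand  = anova(lm(abs.dev ~ ind, data=my.data))
levene.by.hand

# tests of location that do not rely on equal variances
welch = t.test(values ~ ind, data=my.data)       # Welch is the default (var.equal=FALSE)
mwu   = wilcox.test(values ~ ind, data=my.data)  # Mann-Whitney U (Wilcoxon rank-sum)
welch
mwu
```

Neither t.test() nor wilcox.test() needs the groups to be the same size; the formula interface handles the unequal counts automatically.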

Related Solutions

Solved – How to these residuals have homogeneity of variances

In my opinion, @Alex's questions are on point (+1): the FK test was developed for cross-sectional data, not time series objects. Even within subsamples the data points will be dependent, so the tests are probably not applicable.

Since heteroscedasticity is a part of some model (you noted these are the residuals, probably from some linear model) you may consider many different alternatives from the lmtest package.

One of the most common options is the Goldfeld-Quandt test, gqtest(); however, its ad hoc nature (you have to choose how to split the sample) makes it less attractive than the alternatives below.
The Breusch-Pagan test, bptest(), runs an additional regression of the squared residuals on the explanatory variables and rejects the homoscedastic null when that dependence is significant.
Another common alternative to the first two tests is the family of White tests, which are generally presented as LM-type tests comparing the original and auxiliary regression models.
If a more complicated structure for the variance of the residuals is being considered, you may also look into the very rich family of conditionally heteroscedastic models.
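
As a rough illustration of the first two tests, here is a sketch on simulated data. The data, the seed, and the model are hypothetical and are not the residuals from the question; the only requirement is that the lmtest package is installed.

```r
library(lmtest)                      # provides bptest() and gqtest()

set.seed(1)                          # hypothetical example data
x = runif(200, 1, 10)
y = 1 + 2*x + rnorm(200, sd=0.5*x)   # error sd grows with x, i.e. heteroscedastic
fit = lm(y ~ x)

bp = bptest(fit)                     # Breusch-Pagan: auxiliary regression on x
gq = gqtest(fit, order.by = x)       # Goldfeld-Quandt: order by x, split in the middle
bp
gq
```

Both functions assume independent observations, so the caveat about time series data raised above still applies.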

Solved – Fligner-Killeen test of homogeneity of variances interpretation

Comments:

(a) When making stripcharts, variations from the default are sometimes useful for visualizing data. Here are stripcharts for data somewhat similar to yours.

a = 24 + 10*rbeta(150, 1.1, 1.1)  # generate fake data
b = 24 + 10*rbeta(150, 1.1, 1.1)
x = c(a, b)                       # assumed layout: both samples in one vector
gp = rep(1:2, each=150)           # assumed group labels: 1 for a, 2 for b

par(mfrow=c(2,1))                 # enable two panels per plot
  stripchart(x ~ gp, pch="|", ylim=c(.5, 2.5))            # narrow plotting symbol
  stripchart(x ~ gp, method="jitter", ylim=c(.5, 2.5))    # jittered to mitigate overplotting
par(mfrow=c(1,1))                 # return to single-panel plotting

(b) I am beginning to wonder whether you have two independent samples or whether you have paired data. The very high P-value from your var.test is suspicious. (In my view, very high P-values are always worth a second look. "If the P-value is very small, reject the null hypothesis; if it is very large, suspect the model or the computation.") Here is what I got for my fake independent data:

var.test(a, b)

        F test to compare two variances

data:  a and b
F = 0.95059, num df = 149, denom df = 149, p-value = 0.7575
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.6886359 1.3121767
sample estimates:
ratio of variances 
         0.9505851 

var.test(x ~ gp)
  [essentially identical output]

Here are fake paired data (effect of pairing perhaps somewhat exaggerated):

err = rnorm(150, 0, .1);  aa = a + err
cor(a, aa)
[1] 0.9992711

You can check for pairing by looking at the correlation and by plotting.

par(mfrow=c(1,2))
 plot(a, b, pch=20, main="Independent");  plot(a, aa, pch=20, main="Paired")
par(mfrow=c(1,1))

For paired data var.test shows P-value near 1 [some output abridged], as in your Question.

var.test(a, aa)

        F test to compare two variances

data:  a and aa
F = 0.99757, num df = 149, denom df = 149, p-value = 0.9882
...

If your data are paired, you should consider the Wilcoxon signed-rank test, instead of the Wilcoxon rank-sum test. If you have further questions, please provide more detail about your data: how collected, purpose of study, and so on. Then perhaps one of us can offer further comments or advice.
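
For reference, the two Wilcoxon tests differ by a single argument in R. This is a minimal sketch on fake paired data generated like the data above; the seed is arbitrary, so the numbers will not match the outputs shown earlier.

```r
set.seed(2024)                       # arbitrary seed, chosen for this sketch only
a  = 24 + 10*rbeta(150, 1.1, 1.1)    # fake data
aa = a + rnorm(150, 0, .1)           # paired with a

signed.rank = wilcox.test(a, aa, paired=TRUE)  # signed-rank: works on within-pair differences
rank.sum    = wilcox.test(a, aa)               # rank-sum: treats the samples as independent
signed.rank
rank.sum
```

The signed-rank version requires the two vectors to have the same length and to be in matching order, which is itself a quick check on whether the data really are paired.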
