Hypothesis Testing – Conducting an F Test for Equality of Variances

Tags: f-test, hypothesis testing, self-study, variance

I know that the test statistic is $$F=S_1^2/S_2^2 $$

But I am looking at some example questions from my lecturer and some have confused me. For example:

For a certain game, individual game scores are normally distributed. Two players played 10 games each, and recorded their scores on each game. For player A, the average score is 375 and the sample variance is 17312. For player B, the average score is 360 and the sample variance is 13208.

Test, at the 5% level, the hypothesis that the variances of the two players' scores are the same, assuming that the true means are unknown.

And he uses the equation $$F=(17312/9)/(13208/9) $$ Obviously the result is the same here, but in other examples I have looked at (which I can't find now) the $n$s do not cancel, so the two formulas give different values. How do I know when to use which equation?

Best Answer

There appears to be a difference in the interpretation of a statistical formula. One quick, simple, and compelling way to resolve such differences is to simulate the situation. Here, you have noted there will be a difference when the players play different numbers of games. Let's therefore retain every aspect of the question but change the number of games played by the second player. We will run a large number ($10^5$) of iterations, collecting the two versions of the $F$ statistic in each case, and draw histograms of their results. Overplotting these histograms with the $F$ distribution ought to determine, without any further debate, which formula (if any!) is correct.

Here is R code to do this. It takes only a couple of seconds to execute.

s <- sqrt((9 * 17312 + 9*13208) / (9 + 9))             # Common SD
m <- 375                                               # Common mean
n.sim <- 10^5                                          # Number of iterations
n1 <- 10                                               # Games played by player 1
n2 <- 3                                                # Games played by player 2
x <- matrix(rnorm(n1*n.sim, mean=m, sd=s), ncol=n.sim) # Player 1's results
y <- matrix(rnorm(n2*n.sim, mean=m, sd=s), ncol=n.sim) # Player 2's results
F.sim <- apply(x, 2, var) / apply(y, 2, var)           # S1^2/S2^2

par(mfrow=c(1,2))                                      # Show both histograms
#
# On the left: histogram of the S1^2/S2^2 results.
#
hist(log(F.sim), probability=TRUE, breaks=50, main="S1^2/S2^2")
curve(df(exp(x),n1-1,n2-1)*exp(x), add=TRUE, from=log(min(F.sim)),
   to=log(max(F.sim)), col="Red", lwd=2)
#
# On the right: histogram of the (S1^2/(n1-1)) / (S2^2/(n2-1)) results.
#
F.sim2 <- F.sim * (n2-1) / (n1-1)
hist(log(F.sim2), probability=TRUE, breaks=50, main="(S1^2/[n1-1])/(S2^2/[n2-1])")
curve(df(exp(x),n1-1,n2-1)*exp(x), add=TRUE, from=log(min(F.sim2)),
   to=log(max(F.sim2)), col="Red", lwd=2)               # Range taken from F.sim2

Although it is unnecessary, this code uses the common mean ($375$) and pooled standard deviation (computed as s in the first line) for the simulation. Also of note is that the histograms are drawn on logarithmic scales, because when the numbers of games get small (n2, equal to $3$ here), the $F$ distribution can be extremely skewed.
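The factor `exp(x)` that multiplies `df(exp(x), ...)` in the `curve` calls comes from the change of variables to the log scale. As a brief sketch, writing $f_F$ for the density of the $F$ statistic and $Y = \log F$ (notation introduced here only for illustration):

$$ f_Y(y) = f_F(e^y)\,\left|\frac{d}{dy}e^y\right| = f_F(e^y)\,e^y, $$

so the red curve is the $F(n_1-1,\,n_2-1)$ density re-expressed as a density for $\log F$.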

Here is the output. Which formula actually matches the $F$ distribution (the red curve)?

(The discrepancy in the right-hand panel is so dramatic that even just $100$ iterations would suffice to show its formula has serious problems. Thus in the future you probably won't need to run $10^5$ iterations; one-tenth as many will usually do fine.)
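For reference, the standard normal-theory argument behind the $S_1^2/S_2^2$ statistic can be sketched as follows, assuming independent normal samples and writing $\sigma_1^2, \sigma_2^2$ for the true variances and $Q_i$ for the scaled sample variances (symbols not used in the question):

$$ Q_i = \frac{(n_i-1)S_i^2}{\sigma_i^2} \sim \chi^2_{n_i-1}\ (i=1,2), \qquad \frac{Q_1/(n_1-1)}{Q_2/(n_2-1)} = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F(n_1-1,\,n_2-1). $$

Under the null hypothesis $\sigma_1^2=\sigma_2^2$ this ratio reduces to $S_1^2/S_2^2$. The division by the degrees of freedom is already built into each sample variance $S_i^2$, so dividing by $n_i-1$ again would be appropriate only if the quantities supplied were sums of squared deviations rather than sample variances.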

If you like, modify this to fit some of the other examples you have looked at.
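As one possible starting point, here is a minimal sketch of the two-sided test for the numbers in the question. It assumes the reported figures are sample variances (divisor $n-1$); the variable names are illustrative only.

```r
# Worked F test for the question's data (two-sided, 5% level).
# Assumes the reported figures are sample variances (divisor n - 1).
s1.sq <- 17312; n1 <- 10   # Player A
s2.sq <- 13208; n2 <- 10   # Player B
alpha <- 0.05

F.obs   <- s1.sq / s2.sq                    # S1^2/S2^2
crit.lo <- qf(alpha/2,     n1 - 1, n2 - 1)  # Lower critical value
crit.hi <- qf(1 - alpha/2, n1 - 1, n2 - 1)  # Upper critical value
p.value <- 2 * min(pf(F.obs, n1 - 1, n2 - 1),
                   pf(F.obs, n1 - 1, n2 - 1, lower.tail = FALSE))
reject  <- F.obs < crit.lo || F.obs > crit.hi

print(c(F = F.obs, lower = crit.lo, upper = crit.hi, p = p.value))
print(reject)
```

With these numbers the statistic is about $1.31$, which lies between the two critical values, so equality of variances is not rejected at the 5% level. When raw scores are available rather than summaries, base R's `var.test` performs the same test.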

Related Solutions

Hypothesis Testing – Why Use n-1 Instead of n in Pooled Sample Variance?

For a two-sample t test on samples from populations with the same variance $\sigma^2,$ you have two proposed variance estimates

$$ S_p^2 = \frac{(n_1 - 1)S^2_1+(n_2-1)S_2^2}{n_1+n_2-2},$$

and

$$ S_a^2 = \frac{n_1S^2_1+n_2S^2_2}{n_1+n_2}. $$

For $S_p^2,$ you have found $S_i^2,\ i=1,2,$ each of which requires computing a sample mean $\bar X_i,\ i=1,2,$ and therefore loses one degree of freedom. So,

$$ \frac{\nu S_p^2}{\sigma^2} \sim \mathsf{Chisq}(\nu), $$ where $\nu = n_1+n_2 - 2.$
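The step behind this claim can be sketched in one line, assuming independent normal samples with common variance $\sigma^2$:

$$ \frac{\nu S_p^2}{\sigma^2} = \frac{(n_1-1)S_1^2}{\sigma^2} + \frac{(n_2-1)S_2^2}{\sigma^2}. $$

The two terms on the right are independent with distributions $\mathsf{Chisq}(n_1-1)$ and $\mathsf{Chisq}(n_2-1),$ and the sum of independent chi-squared variables is chi-squared with the degrees of freedom added, giving $\mathsf{Chisq}(n_1+n_2-2).$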

For $S_a^2,$ the distribution theory is not so clear. You say something about $S_a^2$ being unbiased, but that hardly specifies a distribution. Let's use the same degrees of freedom $\nu$ as above for an experiment.

Simulation: Begin by looking at $m = 10^5$ samples x1 of size $n_1 = 2$ from $\mathsf{Norm}(\mu_1 = 100, \sigma_1 = 15)$ and x2 of size $n_2=3$ from $\mathsf{Norm}(\mu_2 = 110, \sigma_2 = 15).$
We find the sample variances, the pooled variance estimate, and the average variance estimate. Then we look at the corresponding chi-squared random variables.

set.seed(2022)
n1 = 2; m=10^5
M1 = matrix(rnorm(n1*m, 100, 15), nrow=m)
v1 = apply(M1, 1, var)
n2 = 3
M2 = matrix(rnorm(n2*m, 110, 15), nrow=m)

v2 = apply(M2, 1, var)

pool = ((n1-1)*v1 + (n2-1)*v2)/(n1+n2-2)   # pooled estimate S_p^2
q.p = (n1+n2-2)*pool/15^2
avg.v = (n1*v1 + n2*v2)/(n1+n2)            # n-weighted estimate S_a^2
q.a = (n1+n2)*avg.v/15^2

Then we compare the results with the density functions of the corresponding chi-squared distribution. For the pooled estimate $S_p^2$ we get a good match, but for $S_a^2$ the fit is not good.
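As a numerical complement to the density comparison, the following sketch checks the first two moments against those of $\mathsf{Chisq}(\nu)$ (mean $\nu,$ variance $2\nu$). It is a self-contained re-simulation with a smaller number of iterations; the names ending in `.chk` are illustrative, and $S_a^2$ is assumed here to be the $n$-weighted average $(n_1S_1^2+n_2S_2^2)/(n_1+n_2).$

```r
# Moment check: a Chisq(nu) variable has mean nu and variance 2*nu.
# Self-contained re-simulation; names ending in .chk are illustrative only.
set.seed(2022)
m.chk <- 20000; n1 <- 2; n2 <- 3; sigma <- 15
nu <- n1 + n2 - 2
v1.chk <- apply(matrix(rnorm(n1*m.chk, 100, sigma), nrow = m.chk), 1, var)
v2.chk <- apply(matrix(rnorm(n2*m.chk, 110, sigma), nrow = m.chk), 1, var)

# Pooled estimate scaled by nu / sigma^2
q.p.chk <- ((n1 - 1)*v1.chk + (n2 - 1)*v2.chk) / sigma^2
# Assumed n-weighted estimate scaled by (n1 + n2) / sigma^2
q.a.chk <- (n1*v1.chk + n2*v2.chk) / sigma^2

print(round(rbind(
  pooled   = c(mean = mean(q.p.chk), var = var(q.p.chk)),
  averaged = c(mean = mean(q.a.chk), var = var(q.a.chk)),
  chisq.nu = c(mean = nu,            var = 2*nu)), 2))
```

Under these assumptions the pooled row should sit close to the `chisq.nu` row, while the averaged row has expected mean $n_1+n_2=5$ rather than $\nu=3.$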

R code for graphs:

par(mfrow=c(1,2))
hist(q.p, prob=TRUE, ylim=c(0,.35), col="skyblue2", main="Pooled")
curve(dchisq(x, n1+n2-2), add=TRUE, lwd=2, col="orange")

hist(q.a, prob=TRUE, ylim=c(0,.35), col="skyblue2", main="Averaged")
curve(dchisq(x, n1+n2-2), add=TRUE, lwd=2, col="orange")  # same df as pooled
par(mfrow=c(1,1))
