Solved – Prove F test is equal to T test squared

Tags: anova, f-test, mathematical-statistics, self-study, t-test

I need to show that the F statistic is equal to the square of the T statistic when the T test compares 2 independent groups and the variances are assumed equal.

I know that $F=\frac{MSB}{MSW}=\frac{SSB/(k-1)}{SSW/(N-k)}$
and I know that $T=\frac{\bar{X}-\bar{Y}}{S_p \sqrt{\frac{1}{n}+\frac{1}{m}}}$,

so $T^2=\frac{(\bar{X}-\bar{Y})^2}{S_p^2 \left(\frac{1}{n}+\frac{1}{m}\right)}$

I've seen this proof in regression, but here we're not using MSE and MSR, so I'm not sure how to connect the two.

Best Answer

I have done this proof on my blog.
Since I already have the code for the equations, I'm reproducing it here.


We have to prove that

$$F_{a-1, N-a} = \frac{MST}{MSE} = \frac{\frac{SST}{a-1}}{\frac{SSE}{N-a}} \tag{1}$$

reduces to

$$t_{k}^2 = \frac{(\bar{y}_{1.} - \bar{y}_{2.})^2}{S_{p}^2(\frac{1}{n_{1}} + \frac{1}{n_{2}})} \tag{2}$$

$\color{red} {\text{When a = 2}}$ (this is key)


Notation

$SSE$: Sum of Squares due to Error
$SST$: Sum of Squares of Treatment
$MSE$: Mean Sum of squares Error
$MST$: Mean Sum of squares Treatment
$a$: Number of treatments
$n_{1}$: Number of observations in treatment 1
$n_{2}$: Number of observations in treatment 2
$N$: Total number of observations
$\bar{y}_{i.}$: Mean of treatment $i$
$\bar{y}_{..}$: Global mean
$k = N - a$: Degrees of freedom of the denominator of F (and of the t statistic)


Now that we have the formulas, we will work through the following steps:

  1. Denominator of equation (1)
  2. Numerator of equation (1)
    2.a. Part a
    2.b. Part b
    2.c. Part c
  3. Put all together

1. Denominator of equation (1)

When $a = 2$ the denominator of expression $(1)$ is:

$$MSE = \frac{SSE}{N-2} = \frac{\sum_{j=1}^{n_1}{(y_{1j} - \bar{y}_{1.})^2} + \sum_{j=1}^{n_2}{(y_{2j} - \bar{y}_{2.})^2}}{N-2} \tag{3}$$

Recalling that the sample variance of treatment $i$ is $$S_{i}^2 = \frac{\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_{i.})^2}{n_{i} - 1}$$ we can multiply and divide each sum in the numerator of $(3)$ by the corresponding $(n_{i} - 1)$ to get $(4)$. Don't forget that in this case $N = n_{1} + n_{2}$.

$$\frac{SSE}{N-2} = \frac{(n_{1} - 1) S_{1}^2 + (n_{2} - 1) S_{2}^2}{n_{1} + n_{2} - 2} = S_{p}^2 \tag{4}$$

$S_{p}^2$ is called the pooled variance estimator.
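
As a quick numerical sanity check of $(4)$, the sketch below computes $SSE/(N-2)$ directly from the within-group deviations and compares it with the pooled variance formula. The data values and function names are made up for illustration; they are not part of the original proof.

```python
from statistics import variance  # sample variance, divides by n - 1


def mse_two_groups(g1, g2):
    """MSE = SSE / (N - 2), computed from the within-group deviations as in (3)."""
    m1 = sum(g1) / len(g1)
    m2 = sum(g2) / len(g2)
    sse = sum((y - m1) ** 2 for y in g1) + sum((y - m2) ** 2 for y in g2)
    return sse / (len(g1) + len(g2) - 2)


def pooled_variance(g1, g2):
    """S_p^2 = ((n1 - 1) S_1^2 + (n2 - 1) S_2^2) / (n1 + n2 - 2), as in (4)."""
    n1, n2 = len(g1), len(g2)
    return ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / (n1 + n2 - 2)


# Hypothetical observations for two treatments (illustrative values only).
g1 = [4.1, 5.0, 6.2, 5.5]
g2 = [7.3, 6.8, 8.1, 7.7, 6.9]

mse_value = mse_two_groups(g1, g2)
sp2_value = pooled_variance(g1, g2)
print(mse_value, sp2_value)  # the two numbers should agree up to rounding
```

The two printed values should coincide up to floating-point rounding, whatever data are used.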


2. Numerator of equation (1)

When $a = 2$ the numerator of expression $(1)$ is:

$$\frac{SST}{2-1} = SST$$

and the general expression for SST reduces to $SST = \sum_{i=1}^2 n_{i} (\bar{y}_{i.} - \bar{y}_{..})^2$. The next step is to expand the sum as follows:

$$SST = \sum_{i=1}^2 n_{i} (\bar{y}_{i.} - \bar{y}_{..})^2 = n_{1} (\bar{y}_{1.} - \bar{y}_{..})^2 + n_{2} (\bar{y}_{2.} - \bar{y}_{..})^2 \tag{5}$$

$\bar{y}_{..}$ is called the global mean, and we are going to write it as a weighted average of the two treatment means:

$$\bar{y}_{..} = \frac{n_{1} \bar{y}_{1.} + n_{2} \bar{y}_{2.}}{N} \tag{6}$$
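
For completeness, $(6)$ follows from the definition of the global mean as the average of all $N$ observations, using $\sum_{j=1}^{n_i} y_{ij} = n_{i} \bar{y}_{i.}$ (here $y_{ij}$ denotes observation $j$ of treatment $i$, as in $(3)$):

$$\bar{y}_{..} = \frac{1}{N} \sum_{i=1}^{2} \sum_{j=1}^{n_i} y_{ij} = \frac{1}{N} \sum_{i=1}^{2} n_{i} \bar{y}_{i.} = \frac{n_{1} \bar{y}_{1.} + n_{2} \bar{y}_{2.}}{N}$$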

Next, substitute $(6)$ into $(5)$ and re-write SST as:

$$SST = \underbrace{n_1 \big[ \bar{y}_{1.} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2}_{\text{Part a}} + \underbrace{n_2 \big[ \bar{y}_{2.} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2}_{\text{Part b}} \tag{7}$$

The next step is to simplify the expressions Part a and Part b.


2.a. Part a

$$\text{Part a} = n_1 \big[ \bar{y}_{1.} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2$$

Multiply and divide the term with $\bar{y}_{1.}$ by $N$

$$n_1 \big[ \frac{N \bar{y}_{1.}}{N} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2$$

Combine the terms over the common denominator $N$

$$n_1 \big[\frac{N \bar{y}_{1.} - n_1 \bar{y}_{1.} - n_2 \bar{y}_{2.}}{N} \big]^2$$

Factor $\bar{y}_{1.}$ out of the terms $N \bar{y}_{1.}$ and $n_1 \bar{y}_{1.}$

$$n_1 \big[\frac{(N - n_1) \bar{y}_{1.} - n_2 \bar{y}_{2.}}{N} \big]^2$$

Replace $(N - n_{1}) = n_{2}$

$$n_1 \big[\frac{n_2 \bar{y}_{1.} - n_2 \bar{y}_{2.}}{N} \big]^2$$

Now factor out $n_{2}$, which is common to both terms in the numerator

$$n_1 \big[\frac{n_2 (\bar{y}_{1.} - \bar{y}_{2.})}{N} \big]^2$$

Take $n_{2}$ and $N$ out of the square (they come out squared)

$$\text{Part a} = \frac{n_{1} n_{2}^2}{N^2} (\bar{y}_{1.} - \bar{y}_{2.})^2$$


2.b. Part b

$$\text{Part b} = n_2 \big[ \bar{y}_{2.} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2$$

Multiply and divide the term with $\bar{y}_{2.}$ by $N$

$$n_2 \big[ \frac{N \bar{y}_{2.}}{N} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2$$

Combine the terms over the common denominator $N$

$$n_2 \big[\frac{N \bar{y}_{2.} - n_1 \bar{y}_{1.} - n_2 \bar{y}_{2.}}{N} \big]^2$$

Factor $\bar{y}_{2.}$ out of the terms $N \bar{y}_{2.}$ and $n_2 \bar{y}_{2.}$

$$n_2 \big[\frac{(N - n_2) \bar{y}_{2.} - n_1 \bar{y}_{1.}}{N} \big]^2$$

Replace $(N - n_{2}) = n_{1}$

$$n_2 \big[\frac{n_1 \bar{y}_{2.} - n_1 \bar{y}_{1.}}{N} \big]^2$$

Now factor out $n_{1}$, which is common to both terms in the numerator

$$n_2 \big[\frac{n_1 (\bar{y}_{2.} - \bar{y}_{1.})}{N} \big]^2$$

Take $n_{1}$ and $N$ out of the square (they come out squared)

$$\text{Part b} = \frac{n_{2} n_{1}^2}{N^2} (\bar{y}_{2.} - \bar{y}_{1.})^2$$


Now that we have Part a and Part b we are going to go back to equation $(7)$ and replace them:

$$SST = \frac{n_{1} n_{2}^2}{N^2} (\bar{y}_{1.} - \bar{y}_{2.})^2 + \frac{n_{2} n_{1}^2}{N^2} (\bar{y}_{2.} - \bar{y}_{1.})^2 \tag{8}$$

Taking into account that $(\bar{y}_{1.} - \bar{y}_{2.})^2 = (\bar{y}_{2.} - \bar{y}_{1.})^2$, we can re-write equation $(8)$ as $(9)$:

$$SST = \underbrace{\big[ \frac{n_{1} n_{2}^2}{N^2} + \frac{n_{2} n_{1}^2}{N^2} \big]}_{\text{Part c}} (\bar{y}_{1.} - \bar{y}_{2.})^2 \tag{9}$$

This leaves us with Part c, which we are going to work on next.


2.c. Part c

$$\text{Part c} = \frac{n_{1} n_{2}^2}{N^2} + \frac{n_{2} n_{1}^2}{N^2}$$

$N^2$ is the common denominator and each of the summands has an $n_{1} n_{2}$ factor that we can factor out. Then we have:

$$\frac{n_{1} n_{2} (n_{1} + n_{2})}{N^2}$$

Replace $N = n_{1} + n_{2}$

$$\frac{n_{1} n_{2} N}{N^2}$$

Cancel one factor of $N$

$$\frac{n_{1} n_{2}}{N}$$

Re-write the fraction

$$\frac{1}{\frac{N}{n_{1} n_{2}}}$$

Replace $N = n_{1} + n_{2}$

$$\frac{1}{\frac{n_{1} + n_{2}}{n_{1} n_{2}}} = \frac{1}{\frac{1}{n_{1}} + \frac{1}{n_{2}}}$$

And we have

$$\text{Part c} = \frac{1}{\frac{1}{n_{1}} + \frac{1}{n_{2}}}$$
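
As a quick arithmetic check with illustrative group sizes $n_{1} = 4$ and $n_{2} = 5$ (so $N = 9$; these values are chosen only for the example), both forms of Part c give the same number:

$$\frac{n_{1} n_{2}^2}{N^2} + \frac{n_{2} n_{1}^2}{N^2} = \frac{100}{81} + \frac{80}{81} = \frac{180}{81} = \frac{20}{9}, \qquad \frac{1}{\frac{1}{4} + \frac{1}{5}} = \frac{1}{\frac{9}{20}} = \frac{20}{9}$$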


Finally, we substitute this expression for Part c into $(9)$ and re-write SST as:

$$SST = \frac{1}{\frac{1}{n_{1}} + \frac{1}{n_{2}}} (\bar{y}_{1.} - \bar{y}_{2.})^2$$
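
The closed form for SST can also be checked numerically. The sketch below compares the definition in $(5)$ with the simplified expression just obtained; the data and function names are hypothetical and serve only as an illustration.

```python
def sst_definition(g1, g2):
    """SST = n1 (ybar1 - ybar)^2 + n2 (ybar2 - ybar)^2, as in (5)."""
    n1, n2 = len(g1), len(g2)
    m1, m2 = sum(g1) / n1, sum(g2) / n2
    grand_mean = (sum(g1) + sum(g2)) / (n1 + n2)
    return n1 * (m1 - grand_mean) ** 2 + n2 * (m2 - grand_mean) ** 2


def sst_closed_form(g1, g2):
    """SST = (ybar1 - ybar2)^2 / (1/n1 + 1/n2), the simplified two-group form."""
    n1, n2 = len(g1), len(g2)
    m1, m2 = sum(g1) / n1, sum(g2) / n2
    return (m1 - m2) ** 2 / (1 / n1 + 1 / n2)


# Hypothetical observations for two treatments (illustrative values only).
g1 = [4.1, 5.0, 6.2, 5.5]
g2 = [7.3, 6.8, 8.1, 7.7, 6.9]

sst_a = sst_definition(g1, g2)
sst_b = sst_closed_form(g1, g2)
print(sst_a, sst_b)  # should agree up to floating-point rounding
```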


3. Put all together

With the previous steps we have shown that, $\color{red} {\text{when a = 2}}$, we have:

$$\frac{SST}{2-1} = \frac{(\bar{y}_{1.} - \bar{y}_{2.})^2}{\frac{1}{n_{1}} + \frac{1}{n_{2}}}$$

and

$$\frac{SSE}{N-2} = S_{p}^2$$

The ratio of these two expressions, namely the F-statistic, is then:

$$F_{1, k} = \frac{\frac{SST}{2-1}}{\frac{SSE}{N-2}} = \frac{(\bar{y}_{1.} - \bar{y}_{2.})^2}{S_{p}^2 \big( \frac{1}{n_{1}} + \frac{1}{n_{2}} \big)} = t_{k}^2$$

And this concludes the proof.
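
The identity is easy to confirm numerically. The following is a minimal sketch that computes the one-way ANOVA F statistic and the pooled two-sample t statistic from their definitions above, using only the standard library; the data and function names are assumptions made for the example.

```python
import math


def group_stats(g):
    """Return (n, mean, sum of squared deviations from the mean) for one group."""
    n = len(g)
    mean = sum(g) / n
    ss = sum((y - mean) ** 2 for y in g)
    return n, mean, ss


def f_statistic(g1, g2):
    """One-way ANOVA F = MST / MSE for a = 2 treatments, as in (1)."""
    n1, m1, ss1 = group_stats(g1)
    n2, m2, ss2 = group_stats(g2)
    total_n = n1 + n2
    grand_mean = (n1 * m1 + n2 * m2) / total_n
    mst = (n1 * (m1 - grand_mean) ** 2 + n2 * (m2 - grand_mean) ** 2) / (2 - 1)
    mse = (ss1 + ss2) / (total_n - 2)
    return mst / mse


def pooled_t_statistic(g1, g2):
    """Two-sample t statistic with pooled variance (equal variances assumed)."""
    n1, m1, ss1 = group_stats(g1)
    n2, m2, ss2 = group_stats(g2)
    sp2 = (ss1 + ss2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))


# Hypothetical observations for two treatments (illustrative values only).
g1 = [4.1, 5.0, 6.2, 5.5]
g2 = [7.3, 6.8, 8.1, 7.7, 6.9]

f_value = f_statistic(g1, g2)
t_value = pooled_t_statistic(g1, g2)
print(f_value, t_value ** 2)  # should agree up to floating-point rounding
```

If SciPy is available, `scipy.stats.f_oneway(g1, g2)` and `scipy.stats.ttest_ind(g1, g2)` (which assumes equal variances by default) should report the same F and t values, and the same p-value.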
