I need to show that the F statistic equals the square of the t statistic when the t test compares 2 independent groups and the variances are assumed equal.
I know that $F=\frac{MSB}{MSW}=\frac{SSB/(k-1)}{SSW/(N-k)}$
and I know that $T=\frac{\bar{X}-\bar{Y}}{S_p \sqrt{\frac{1}{n}+\frac{1}{m}}}$,
so $T^2=\frac{(\bar{X}-\bar{Y})^2}{S_p^2 ({\frac{1}{n}+\frac{1}{m}})}$
I've seen this proof in regression, but here we're not using MSE and MSR, so I'm not sure how to connect the two.
Best Answer
I have written up this proof on my blog.
Since I already have the code for the equations, I'm reproducing it here.
We have to prove that
$$F_{a-1, N-a} = \frac{MST}{MSE} = \frac{\frac{SST}{a-1}}{\frac{SSE}{N-a}} \tag{1}$$
reduces to
$$t_{k}^2 = \frac{(\bar{y}_{1.} - \bar{y}_{2.})^2}{S_{p}^2(\frac{1}{n_{1}} + \frac{1}{n_{2}})} \tag{2}$$
$\color{red} {\text{When a = 2}}$ (this is key)
Notation
$SSE$: Sum of Squares due to Error
$SST$: Sum of Squares due to Treatment
$MSE$: Mean Square Error, $SSE / (N - a)$
$MST$: Mean Square Treatment, $SST / (a - 1)$
$a$: Number of treatments
$n_{1}$: Number of observations in treatment 1
$n_{2}$: Number of observations in treatment 2
$N$: Total number of observations
$\bar{y}_{i.}$: Mean of treatment $i$
$\bar{y}_{..}$: Global mean
$k = N - a$: Degrees of freedom of the denominator of F
Now that we have the formulas, we will work through the following steps:
1. Denominator of equation (1)
2. Numerator of equation (1)
2.a. Part a
2.b. Part b
2.c. Part c
3. Put it all together
1. Denominator of equation (1)
When $a = 2$ the denominator of expression $(1)$ is:
$$MSE = \frac{SSE}{N-2} = \frac{\sum_{j=1}^{n_1}{(y_{1j} - \bar{y}_{1.})^2} + \sum_{j=1}^{n_2}{(y_{2j} - \bar{y}_{2.})^2}}{N-2} \tag{3}$$
Recall that the sample variance of treatment $i$ is $$S_{i}^2 = \frac{\sum_{j=1}^{n_i}(y_{ij} - \bar{y}_{i.})^2}{n_{i} - 1}$$ so each sum in the numerator of $(3)$ equals $(n_{i} - 1) S_{i}^2$. Substituting, and using $N = n_{1} + n_{2}$, gives $(4)$:
$$\frac{SSE}{N-2} = \frac{(n_{1} - 1) S_{1}^2 + (n_{2} - 1) S_{2}^2}{n_{1} + n_{2} - 2} = S_{p}^2 \tag{4}$$
$S_{p}^2$ is called the pooled variance estimator.
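Equation $(4)$ can be checked numerically. The following is a minimal sketch in plain Python; the data in `g1` and `g2` and the helper names are made up for illustration and do not come from the original post.

```python
import math
from statistics import mean, variance

# Hypothetical observations for the two treatments (illustration only)
g1 = [4.1, 5.0, 6.2, 5.5]
g2 = [6.8, 7.1, 5.9, 8.0, 7.4]

def sse_two_groups(a, b):
    """Within-group sum of squares: deviations from each group's own mean."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)

def pooled_variance(a, b):
    """Pooled variance estimator built from the two sample variances."""
    na, nb = len(a), len(b)
    # statistics.variance uses the (n - 1) denominator, matching S_i^2
    return ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)

mse = sse_two_groups(g1, g2) / (len(g1) + len(g2) - 2)  # left side of (4)
sp2 = pooled_variance(g1, g2)                           # right side of (4)
print(math.isclose(mse, sp2))
```

The comparison uses `math.isclose` rather than `==` because the two sides are computed along different floating-point paths.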
2. Numerator of equation (1)
When $a = 2$ the numerator of expression $(1)$ is:
$$\frac{SST}{2-1} = SST$$
and the general expression for SST reduces to $SST = \sum_{i=1}^2 n_{i} (\bar{y}_{i.} - \bar{y}_{..})^2$. The next step is to expand the sum as follows:
$$SST = \sum_{i=1}^2 n_{i} (\bar{y}_{i.} - \bar{y}_{..})^2 = n_{1} (\bar{y}_{1.} - \bar{y}_{..})^2 + n_{2} (\bar{y}_{2.} - \bar{y}_{..})^2 \tag{5}$$
$\bar{y}_{..}$ is the global mean, which can be written as a weighted average of the two treatment means:
$$\bar{y}_{..} = \frac{n_{1} \bar{y}_{1.} + n_{2} \bar{y}_{2.}}{N} \tag{6}$$
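In case the step to $(6)$ is not obvious, here is a short justification. It assumes the usual definitions $\bar{y}_{..} = \frac{1}{N} \sum_{i} \sum_{j} y_{ij}$ and $\bar{y}_{i.} = \frac{1}{n_{i}} \sum_{j} y_{ij}$, which are not written out above. Since $\sum_{j=1}^{n_i} y_{ij} = n_{i} \bar{y}_{i.}$, we get

$$\bar{y}_{..} = \frac{1}{N} \sum_{i=1}^{2} \sum_{j=1}^{n_i} y_{ij} = \frac{1}{N} \sum_{i=1}^{2} n_{i} \bar{y}_{i.} = \frac{n_{1} \bar{y}_{1.} + n_{2} \bar{y}_{2.}}{N}$$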
Next, substitute (6) into (5) and rewrite SST as:
$$SST = \underbrace{n_1 \big[ \bar{y}_{1.} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2}_{\text{Part a}} + \underbrace{n_2 \big[ \bar{y}_{2.} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2}_{\text{Part b}} \tag{7}$$
The next step is to simplify the expressions Part a and Part b.
2.a. Part a
$$\text{Part a} = n_1 \big[ \bar{y}_{1.} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2$$
Multiply and divide the term with $\bar{y}_{1.}$ by $N$
$$n_1 \big[ \frac{N \bar{y}_{1.}}{N} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2$$
Combine the two fractions over the common denominator $N$
$$n_1 \big[\frac{N \bar{y}_{1.} - n_1 \bar{y}_{1.} - n_2 \bar{y}_{2.}}{N} \big]^2$$
$\bar{y}_{1.}$ is a common factor of the terms $N \bar{y}_{1.}$ and $n_1 \bar{y}_{1.}$
$$n_1 \big[\frac{(N - n_1) \bar{y}_{1.} - n_2 \bar{y}_{2.}}{N} \big]^2$$
Since $N = n_{1} + n_{2}$, replace $(N - n_{1})$ with $n_{2}$
$$n_1 \big[\frac{n_2 \bar{y}_{1.} - n_2 \bar{y}_{2.}}{N} \big]^2$$
Now $n_{2}$ is a common factor of both terms in the numerator
$$n_1 \big[\frac{n_2 (\bar{y}_{1.} - \bar{y}_{2.})}{N} \big]^2$$
Square $n_{2}$ and $N$ and take them out of the bracket
$$\text{Part a} = \frac{n_{1} n_{2}^2}{N^2} (\bar{y}_{1.} - \bar{y}_{2.})^2$$
2.b. Part b
$$\text{Part b} = n_2 \big[ \bar{y}_{2.} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2$$
Multiply and divide the term with $\bar{y}_{2.}$ by $N$
$$n_2 \big[ \frac{N \bar{y}_{2.}}{N} - (\frac{n_1 \bar{y}_{1.} + n_2 \bar{y}_{2.}}{N}) \big]^2$$
Combine the two fractions over the common denominator $N$
$$n_2 \big[\frac{N \bar{y}_{2.} - n_1 \bar{y}_{1.} - n_2 \bar{y}_{2.}}{N} \big]^2$$
$\bar{y}_{2.}$ is a common factor of the terms $N \bar{y}_{2.}$ and $n_2 \bar{y}_{2.}$
$$n_2 \big[\frac{(N - n_2) \bar{y}_{2.} - n_1 \bar{y}_{1.}}{N} \big]^2$$
Since $N = n_{1} + n_{2}$, replace $(N - n_{2})$ with $n_{1}$
$$n_2 \big[\frac{n_1 \bar{y}_{2.} - n_1 \bar{y}_{1.}}{N} \big]^2$$
Now $n_{1}$ is a common factor of both terms in the numerator
$$n_2 \big[\frac{n_1 (\bar{y}_{2.} - \bar{y}_{1.})}{N} \big]^2$$
Square $n_{1}$ and $N$ and take them out of the bracket
$$\text{Part b} = \frac{n_{2} n_{1}^2}{N^2} (\bar{y}_{2.} - \bar{y}_{1.})^2$$
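The two simplified forms can be compared against the original terms of $(5)$ numerically. This is only a sketch with made-up data; the variable names are illustrative and not part of the original post.

```python
import math
from statistics import mean

# Hypothetical observations for the two treatments (illustration only)
g1 = [4.1, 5.0, 6.2, 5.5]
g2 = [6.8, 7.1, 5.9, 8.0, 7.4]

n1, n2 = len(g1), len(g2)
N = n1 + n2
m1, m2 = mean(g1), mean(g2)
grand = mean(g1 + g2)  # global mean over all N observations

# The two terms of SST exactly as they appear in equation (5)
part_a_direct = n1 * (m1 - grand) ** 2
part_b_direct = n2 * (m2 - grand) ** 2

# The simplified forms obtained for Part a and Part b
part_a_closed = n1 * n2 ** 2 / N ** 2 * (m1 - m2) ** 2
part_b_closed = n2 * n1 ** 2 / N ** 2 * (m2 - m1) ** 2

print(math.isclose(part_a_direct, part_a_closed))
print(math.isclose(part_b_direct, part_b_closed))
```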
Now that we have Part a and Part b, we go back to equation $(7)$ and substitute them:
$$SST = \frac{n_{1} n_{2}^2}{N^2} (\bar{y}_{1.} - \bar{y}_{2.})^2 + \frac{n_{2} n_{1}^2}{N^2} (\bar{y}_{2.} - \bar{y}_{1.})^2 \tag{8}$$
Taking into account that $(\bar{y}_{1.} - \bar{y}_{2.})^2 = (\bar{y}_{2.} - \bar{y}_{1.})^2$, we can re-write equation $(8)$ as $(9)$:
$$SST = \underbrace{\big[ \frac{n_{1} n_{2}^2}{N^2} + \frac{n_{2} n_{1}^2}{N^2} \big]}_{\text{Part c}} (\bar{y}_{1.} - \bar{y}_{2.})^2 \tag{9}$$
This leaves us with Part c, which we work out next.
2.c. Part c
$$\text{Part c} = \frac{n_{1} n_{2}^2}{N^2} + \frac{n_{2} n_{1}^2}{N^2}$$
$N^2$ is a common denominator, and each summand has a factor of $n_{1} n_{2}$ that we can factor out. Then we have:
$$\frac{n_{1} n_{2} (n_{1} + n_{2})}{N^2}$$
Replace $N = n_{1} + n_{2}$
$$\frac{n_{1} n_{2} N}{N^2}$$
Cancel one factor of $N$
$$\frac{n_{1} n_{2}}{N}$$
Rewrite the fraction as a reciprocal
$$\frac{1}{\frac{N}{n_{1} n_{2}}}$$
Replace $N = n_{1} + n_{2}$
$$\frac{1}{\frac{n_{1} + n_{2}}{n_{1} n_{2}}} = \frac{1}{\frac{1}{n_{1}} + \frac{1}{n_{2}}}$$
And we have
$$\text{Part c} = \frac{1}{\frac{1}{n_{1}} + \frac{1}{n_{2}}}$$
Finally, substitute this expression for Part c into $(9)$ and rewrite SST as:
$$SST = \frac{1}{\frac{1}{n_{1}} + \frac{1}{n_{2}}} (\bar{y}_{1.} - \bar{y}_{2.})^2$$
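As a quick sanity check (not part of the original derivation), consider the balanced case $n_{1} = n_{2} = n$, an assumption made only for this example. The formula gives

$$SST = \frac{1}{\frac{1}{n} + \frac{1}{n}} (\bar{y}_{1.} - \bar{y}_{2.})^2 = \frac{n}{2} (\bar{y}_{1.} - \bar{y}_{2.})^2$$

This agrees with a direct computation: with equal group sizes the global mean is the midpoint of the two treatment means, so each treatment mean deviates from it by $\pm \frac{1}{2}(\bar{y}_{1.} - \bar{y}_{2.})$, and

$$SST = n \Big[ \frac{\bar{y}_{1.} - \bar{y}_{2.}}{2} \Big]^2 + n \Big[ \frac{\bar{y}_{2.} - \bar{y}_{1.}}{2} \Big]^2 = \frac{n}{2} (\bar{y}_{1.} - \bar{y}_{2.})^2$$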
3. Put it all together
With the previous steps we have shown that, $\color{red} {\text{when a = 2}}$, we have:
$$\frac{SST}{2-1} = \frac{(\bar{y}_{1.} - \bar{y}_{2.})^2}{\frac{1}{n_{1}} + \frac{1}{n_{2}}}$$
and
$$\frac{SSE}{N-2} = S_{p}^2$$
The ratio of these two expressions, namely the F-statistic, is then:
$$F_{1, k} = \frac{\frac{SST}{2-1}}{\frac{SSE}{N-2}} = \frac{(\bar{y}_{1.} - \bar{y}_{2.})^2}{S_{p}^2 \big( \frac{1}{n_{1}} + \frac{1}{n_{2}} \big)} = t_{k}^2$$
And this concludes the proof.
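For anyone who wants to see the identity on actual numbers, here is a minimal sketch in plain Python that computes $F$ from the ANOVA definitions and $t$ from the pooled two-sample formula, then compares $F$ with $t^2$. The data and the function names `f_statistic` and `t_statistic` are made up for illustration.

```python
import math
from statistics import mean, variance

# Hypothetical observations for the two treatments (illustration only)
g1 = [4.1, 5.0, 6.2, 5.5]
g2 = [6.8, 7.1, 5.9, 8.0, 7.4]

def f_statistic(a, b):
    """One-way ANOVA F for two groups, straight from MST / MSE."""
    n1, n2 = len(a), len(b)
    N = n1 + n2
    m1, m2, grand = mean(a), mean(b), mean(a + b)
    sst = n1 * (m1 - grand) ** 2 + n2 * (m2 - grand) ** 2
    sse = sum((x - m1) ** 2 for x in a) + sum((x - m2) ** 2 for x in b)
    return (sst / (2 - 1)) / (sse / (N - 2))

def t_statistic(a, b):
    """Two-sample t statistic with pooled variance (equal variances assumed)."""
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

F = f_statistic(g1, g2)
t = t_statistic(g1, g2)
print(F, t ** 2, math.isclose(F, t ** 2))
```

Note that $t$ keeps the sign of $\bar{y}_{1.} - \bar{y}_{2.}$ while $F$ does not, so the F test corresponds to the two-sided t test.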