Derive the distribution of the ANOVA F-statistic under the alternative hypothesis

Say we have $k$ samples of data, where sample $i$ is of size $n_i$ and we write it as $x_{i1}, \ldots, x_{in_i}$. Let the total sample size be $N = \sum_i n_i$.

The ANOVA model is $X_{ij} \sim N(\mu_i, \sigma^2)$ independently. The null hypothesis is that the $\mu_i$ are all equal. The alternative hypothesis is that the null hypothesis is not true.

The ANOVA F-statistic is

$$F = \frac{S_2/(k-1)}{S_1/(N - k)},$$

where

$$S_1 = \sum_{i, j}(x_{ij} - \bar{x}_{i\bullet})^2$$

is the within-samples sum of squares and

$$S_2 = \sum_i n_i(\bar{x}_{i\bullet} - \bar{x}_{\bullet\bullet})^2$$

is the between-samples sum of squares.

We know that $S_1$ and $S_2$ are independent and $S_0 = S_1 + S_2$ $(*)$, where

$$S_0 = \sum_{i, j}(x_{ij} - \bar{x}_{\bullet\bullet})^2$$

is the total sum of squares.
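As a quick numerical illustration of the definitions above, the following sketch computes $S_1$, $S_2$, $S_0$ and $F$ for a small made-up data set and checks the identity $(*)$. The function names and the data are hypothetical, chosen only for this example.

```python
def anova_sums_of_squares(groups):
    """Return (S1, S2, S0) for a list of samples, each a list of floats."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Within-samples sum of squares S1: deviations from each sample's own mean
    s1 = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    # Between-samples sum of squares S2: weighted deviations of sample means
    s2 = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Total sum of squares S0: deviations from the grand mean
    s0 = sum((x - grand_mean) ** 2 for x in all_values)
    return s1, s2, s0


def f_statistic(groups):
    """ANOVA F-statistic: (S2 / (k - 1)) / (S1 / (N - k))."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    s1, s2, _ = anova_sums_of_squares(groups)
    return (s2 / (k - 1)) / (s1 / (n_total - k))


# Hypothetical data: k = 3 samples of sizes 3, 4 and 2
groups = [[4.1, 5.0, 5.9], [6.2, 7.1, 6.6, 7.3], [5.5, 4.8]]
s1, s2, s0 = anova_sums_of_squares(groups)
print("S1 =", s1, "S2 =", s2, "S0 =", s0)
print("S0 - (S1 + S2) =", s0 - (s1 + s2))  # zero up to rounding error
print("F =", f_statistic(groups))
```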

It is straightforward to show that under both the null and the alternative hypotheses, $S_1 \sim \sigma^2\chi^2_{N - k}$.

Also, under the null hypothesis, the $X_{ij}$ are identically distributed, and so $S_0 \sim \sigma^2\chi^2_{N-1}$. It follows from $(*)$ that under the null hypothesis $S_2 \sim \sigma^2\chi^2_{k-1}$ and thus $F \sim F_{k-1, N-k}$.

It is claimed that under the alternative hypothesis, $F$ follows a non-central $F$-distribution $F_{k-1, N-k}(\lambda)$, where $\lambda = \sum_i n_i(\mu_i - \bar\mu)^2$ and $\bar\mu = \sum_i n_i\mu_i/N$, or equivalently, that $S_2$ follows a (scaled) non-central $\chi^2$ distribution, $\sigma^2\chi^2_{k-1}(\lambda)$. (Here $\lambda$ is written for $\sigma = 1$; for general $\sigma$ the non-centrality parameter is $\lambda/\sigma^2$.)

My tentative approach to proving this is similar to the derivation under the null hypothesis: since $S_1$ is independent of $S_2$ and has a central $\sigma^2\chi^2_{N-k}$ distribution, it is sufficient to prove that $S_0$ follows a (scaled) non-central $\chi^2$ distribution, $\sigma^2\chi^2_{N-1}(\lambda)$.

I've shown that this would follow from a slightly more general statement, namely that if $Y_i \sim N(\mu_i, \sigma^2)$ independently (sample size $N$), then $S_0 = \sum_i(Y_i - \bar{Y})^2 \sim \sigma^2\chi^2_{N-1}(\lambda)$, where $\lambda = \sum_i(\mu_i - \bar\mu)^2$.

Is this the best approach? And what is the simplest proof of the final statement above?

Thanks.

Best Answer

Consider $Y_i \sim N(\mu_i, \sigma^2)$ (independently) as a random vector with a multivariate normal distribution, $\vec Y \sim N(\vec\mu, \sigma^2 I)$.

The main idea is to consider this distribution with reference to a new coordinate system. We choose the first axis to be along $\vec 1 := (1, \ldots, 1)$. The projection of any vector $\vec y$ onto this axis is $\bar y \vec 1$. In particular the projection of $\vec \mu$ is $\bar \mu \vec 1$. We use this as the origin. (We choose any orthonormal set for the remaining axes.)

We can decompose $\vec Y - \bar \mu \vec 1$ as $(\bar Y \vec 1 - \bar \mu \vec 1) + (\vec Y - \bar Y \vec 1)$, where the first term is the projection along the first axis, and the second term is the projection orthogonal to the first axis.

Thus $|\vec Y - \bar \mu \vec 1|^2 = |\bar Y \vec 1 - \bar \mu \vec 1|^2 + |\vec Y - \bar Y \vec 1|^2$.
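The decomposition above is just Pythagoras' theorem for two orthogonal components, and it can be checked numerically. The sketch below uses made-up vectors `y` and `mu` (hypothetical values, not from the question) and confirms that the cross term vanishes.

```python
import math

y = [2.0, -1.0, 3.5, 0.5]   # hypothetical observed vector Y
mu = [1.0, 0.0, 2.0, 1.0]   # hypothetical mean vector

n = len(y)
y_bar = sum(y) / n
mu_bar = sum(mu) / n

# Squared length of Y - mu_bar * 1
lhs = sum((yi - mu_bar) ** 2 for yi in y)
# Component along the 1-axis: (Y_bar - mu_bar) * 1, squared length n * (Y_bar - mu_bar)^2
along = n * (y_bar - mu_bar) ** 2
# Component orthogonal to the 1-axis: Y - Y_bar * 1
orth = sum((yi - y_bar) ** 2 for yi in y)
# Inner product of the two components; zero because they are orthogonal
cross = sum((y_bar - mu_bar) * (yi - y_bar) for yi in y)

print("lhs =", lhs)
print("along + orth =", along + orth)
print("cross term =", cross)
```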

It is a fact about independent normal distributions with equal variances that for any choice of orthonormal coordinate system, the coordinates are still independent normal variables with the same variance. So the two terms on the right hand side, which depend on disjoint sets of coordinates, are independent.

And it is easy to show that:

$|\vec Y - \bar \mu \vec 1|^2$ has a non-central $\sigma^2 \chi^2_N(\lambda)$ distribution, with $\lambda = |\vec \mu - \bar\mu \vec 1|^2 = \sum_i(\mu_i - \bar\mu)^2$, since its components $Y_i - \bar\mu$ are independent $N(\mu_i - \bar\mu, \sigma^2)$, and

$|\bar Y \vec 1 - \bar \mu \vec 1|^2 = N(\bar Y - \bar\mu)^2$ has a central $\sigma^2 \chi_1^2$ distribution, since $\bar Y - \bar\mu \sim N(0, \sigma^2/N)$.

Finally, note $|\vec Y - \bar Y \vec 1|^2 =\sum_{i}(Y_i - \bar Y)^2 = S_0$.

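The left hand side is therefore the sum of two independent terms, one of which is central $\sigma^2\chi^2_1$. Comparing moment generating functions (the additive property of independent non-central $\chi^2$ distributions, used in reverse), we must have $S_0 \sim \sigma^2\chi^2_{N-1}(\lambda)$.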

Note: I think this is a specific application of the idea of Cochran's theorem.
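The conclusion can be sanity-checked by simulation. A non-central $\chi^2_\nu(\delta)$ variable has mean $\nu + \delta$ and variance $2(\nu + 2\delta)$, so the sketch below draws many copies of $S_0$ and compares the sample mean and variance with those values for $\nu = N - 1$ and $\delta = \lambda$. The mean vector, seed and number of draws are arbitrary choices made for this example, and it takes $\sigma = 1$ so that the result does not depend on how the non-centrality parameter is scaled. Matching two moments is only a plausibility check, not a proof.

```python
import random


def simulate_s0(mu, sigma, rng):
    """Draw Y_i ~ N(mu_i, sigma^2) independently and return sum_i (Y_i - Y_bar)^2."""
    y = [rng.gauss(m, sigma) for m in mu]
    y_bar = sum(y) / len(y)
    return sum((yi - y_bar) ** 2 for yi in y)


rng = random.Random(12345)          # fixed seed so the run is reproducible
mu = [0.0, 1.0, 2.0, 4.0, 3.0]      # hypothetical means (assumption for this example)
sigma = 1.0                         # assumption: unit variance
n = len(mu)
mu_bar = sum(mu) / n
lam = sum((m - mu_bar) ** 2 for m in mu)  # non-centrality parameter lambda

draws = [simulate_s0(mu, sigma, rng) for _ in range(100_000)]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)

expected_mean = (n - 1) + lam           # mean of chi^2_{N-1}(lambda)
expected_var = 2 * ((n - 1) + 2 * lam)  # variance of chi^2_{N-1}(lambda)

print("sample mean:", mean, "expected:", expected_mean)
print("sample variance:", var, "expected:", expected_var)
```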
