There are several functions in R for calculating the power of a test, for example the pwr functions of the pwr package.
My question is how one arrives at this function for ANOVA (pwr.anova.test).
I know that
\begin{align}
1-\beta &= P(H_1\mid H_1) \\
&= P(F>F_{1-\alpha,k-1,N-k}) \\
&= 1-P(F\le F_{1-\alpha,k-1,N-k}) \\
&= 1-P(MQE/MQR\le F_{1-\alpha,k-1,N-k}) \\
&= 1-P\bigg(nk\bigg(\frac{\sum(\bar{x}_{i.}-\bar{x}_{..})^2/(k-1)}{\sum\sum(x_{ij}-\bar{x}_{i.})^2/(k(n-1))} - f^2 + f^2\bigg) \le F_{1-\alpha,k-1,N-k}\bigg) \\
&= 1-P\bigg(nk\bigg(\frac{\sum(\bar{x}_{i.}-\bar{x}_{..})^2/(k-1)}{\sum\sum(x_{ij}-\bar{x}_{i.})^2/(k(n-1))} - \frac{\sum(\mu_i-\mu)^2/k}{\sigma^2}\bigg)\le F_{1-\alpha,k-1,N-k}-nkf^2\bigg) \\
&= \;???\; \\
&= 1-P(F_{k-1,N-k}\le F_{1-\alpha,k-1,N-k}-knf^2)
\end{align}
p.body = pf(qf(alpha, k - 1, (n - 1) * k, lower.tail = FALSE), k - 1, (n - 1) * k, k * n * f^2, lower.tail = FALSE)
with $f^2$ the effect size, $knf^2$ the noncentrality parameter, and $\sigma^2$ the within-group error variance.
But what does the quantity that follows an $F_{k-1,N-k}$ distribution look like? It has to be (variance between groups/df) / (variance within groups/df) $-\,knf^2$, but how can I rewrite this so that the distribution becomes visible? In my own calculation I could not see it, and I have never found an exact derivation of this formula. Maybe you can help.
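For concreteness, the quoted p.body expression can be evaluated numerically; the values of k, n and f below are illustrative choices, not from the question:

```r
# Evaluate the pwr.anova.test power formula for illustrative inputs:
# k = 3 groups, n = 4 observations per group, effect size f = 0.5.
k <- 3; n <- 4; f <- 0.5; alpha <- 0.05
p.body <- pf(qf(alpha, k - 1, (n - 1) * k, lower.tail = FALSE),
             k - 1, (n - 1) * k, k * n * f^2, lower.tail = FALSE)
p.body
```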
Best Answer
Power of F-tests for Gaussian linear models
General $F$-test
Any Gaussian linear model can be written $\boxed{Y=\mu+\sigma G}$, where $G$ has the standard normal distribution on $\mathbb{R}^n$ and $\mu$ is assumed to belong to a linear subspace $W$ of $\mathbb{R}^n$. Usually, the theory of Gaussian linear models treats them in the form $Y=X\beta+\sigma G$, corresponding to $W=\text{Im}(X)$. There are good reasons for that, but the underlying geometry is clearer in the $\boxed{Y=\mu+\sigma G}$ formulation.
The so-called "ANOVA test" is a particular case of an $F$-test for a null hypothesis $H_0\colon\{\mu \in U\}$, where $U\subset W$ is a linear subspace. The $F$-test exactly coincides with the likelihood-ratio test in this situation, and it is based on the Fisher statistic $$ F = \frac{{\Vert P_Z Y\Vert}^2/(m-\ell)}{{\Vert P_W^\perp Y\Vert}^2/(n-m)}, $$ where $Z=U^\perp \cap W$ is the orthogonal complement of $U$ in $W$, and where $m=\dim(W)$ and $\ell=\dim(U)$.
Obviously $\boxed{P_Z Y = P_Z \mu + \sigma P_Z G}$ and $\boxed{P_W^\perp Y = \sigma P_W^\perp G}$.
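As an illustration (my own sketch, not part of the original answer), the statistic $F$ can be computed directly from these projections for a one-way layout; all variable names below are illustrative:

```r
# Compute the ANOVA F statistic from the projections P_Z and P_W-perp,
# for a one-way layout with 3 groups of 4 observations (n = 12, m = 3, l = 1).
set.seed(1)
g <- factor(rep(1:3, each = 4))                 # grouping factor
y <- as.numeric(g) + rnorm(12)                  # response vector
X <- model.matrix(~ g)                          # columns span W, dim(W) = 3
ones <- matrix(1, 12, 1)                        # spans U (all means equal)
PW <- X %*% solve(crossprod(X), t(X))           # orthogonal projection onto W
PU <- ones %*% solve(crossprod(ones), t(ones))  # orthogonal projection onto U
PZ <- PW - PU                                   # projection onto Z = U-perp in W
Fstat <- (sum((PZ %*% y)^2) / (3 - 1)) /
         (sum(((diag(12) - PW) %*% y)^2) / (12 - 3))
Fstat
```

The same value is returned as the $F$ statistic of `anova(lm(y ~ g))`.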
When $H_0\colon\{\mu \in U\}$ is true then $P_Z \mu = 0$ and therefore $$ F = \frac{{\Vert P_Z G\Vert}^2/(m-\ell)}{{\Vert P_W^\perp G\Vert}^2/(n-m)} \sim F_{m-\ell,n-m} $$ has the Fisher $F_{m-\ell,n-m}$ distribution. Consequently, from the classical relation between the Fisher distribution and the Beta distribution, $R^2 \sim {\cal B}\bigl(\tfrac{m-\ell}{2}, \tfrac{n-m}{2}\bigr)$.
In the general situation we have to deal with $P_Z Y = P_Z \mu + \sigma P_Z G$ when $P_Z\mu \neq 0$. In this case one has ${\Vert P_Z Y\Vert}^2 \sim \sigma^2\chi^2_{m-\ell}(\lambda)$, where $\chi^2_{m-\ell}(\lambda)$ is the noncentral $\chi^2$ distribution with $m-\ell$ degrees of freedom and noncentrality parameter $\boxed{\lambda=\frac{{\Vert P_Z \mu\Vert}^2}{\sigma^2}}$, and then $F$ has the noncentral Fisher distribution $\boxed{F \sim F_{m-\ell,n-m}(\lambda)}$. To compute $P_Z\mu$, note that $P_Z = P_W - P_U$ and $P_W\mu=\mu$, hence the only thing to compute is $P_U \mu$.
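To make the computation of $P_Z\mu$ concrete for a balanced one-way ANOVA, here is a small sketch; the group means and $\sigma$ below are assumed values for illustration only:

```r
# lambda = ||P_Z mu||^2 / sigma^2 for balanced one-way ANOVA:
# k = 3 groups, n = 4 observations per group (illustrative values).
mu_groups <- c(1, 2, 3); n_per <- 4; sigma <- 2
mu <- rep(mu_groups, each = n_per)            # the mean vector, which lies in W
# In the balanced case P_U mu replaces every entry by the grand mean,
# so P_Z mu = mu - mean(mu) componentwise:
lambda <- sum((mu - mean(mu))^2) / sigma^2
# This equals k*n*f^2 with f^2 = (sum((mu_i - mu_bar)^2)/k) / sigma^2:
f2 <- (sum((mu_groups - mean(mu_groups))^2) / 3) / sigma^2
c(lambda, 3 * n_per * f2)
```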
Power of the $F$-test
The power of the $F$-test depends on the significance level $\alpha$ chosen by the user. The critical value $c$ of the test is the value for which $\Pr_0(F>c)=\alpha$, where $\Pr_0$ denotes the probability under the null $H_0$. The power is then $\Pr(F>c)$, where $\Pr$ denotes the probability under the true (unknown) parameters. As we have seen, this probability depends only on $\lambda$, $n$, $m$ and $\ell$.
R code
The effect size in this situation is $\sqrt{\frac{\lambda}{n}}$, and it is more usual to take the effect size, rather than the noncentrality parameter $\lambda$, as the input of a power calculation. Thus I have written the following R function, which returns the power using the effect size as an argument rather than the noncentrality parameter.
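The function itself did not survive in this copy; a minimal version consistent with the description (the name `powerF` and its argument order are my own) could be:

```r
# Power of the F-test for a Gaussian linear model, parameterized by the
# effect size sqrt(lambda/n) rather than the noncentrality parameter lambda.
powerF <- function(effectsize, alpha, n, m, l) {
  df1 <- m - l                        # numerator degrees of freedom
  df2 <- n - m                        # denominator degrees of freedom
  lambda <- n * effectsize^2          # noncentrality parameter
  crit <- qf(1 - alpha, df1, df2)     # critical value of the level-alpha test
  pf(crit, df1, df2, ncp = lambda, lower.tail = FALSE)
}
```

For the balanced three-group ANOVA with 4 observations per group, `powerF(0.5, 0.05, 12, 3, 1)` uses the same degrees of freedom and noncentrality parameter as `pwr.anova.test(k = 3, n = 4, f = 0.5, sig.level = 0.05)`.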
Example: the case of ANOVA
The pwr.anova.test() function of the pwr package calculates the power of ANOVA for the balanced case only, for example $3$ groups with $4$ observations per group, at $\alpha=5\%$ and an effect size of $0.5$. With the general notations of the $F$-test, here $U$ is a one-dimensional subspace corresponding to the assumption that all group means are equal, and the space $W$ has dimension equal to the number of groups, here $3$. Thus my function gives the power with $n=12$, $m=3$ and $\ell=1$.
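As a sketch of that computation (a direct pf/qf call standing in for the answer's own function, since its code is not reproduced here):

```r
# Balanced three-group example, computed directly from the noncentral F:
# k = 3 groups, n = 4 observations per group, so N = 12 observations,
# df1 = k - 1 = 2, df2 = (n - 1)*k = 9, and lambda = k*n*f^2 = 3 for f = 0.5.
power <- pf(qf(0.95, 2, 9), 2, 9, ncp = 3, lower.tail = FALSE)
power
# With the pwr package, the same number comes from:
#   pwr.anova.test(k = 3, n = 4, f = 0.5, sig.level = 0.05)
```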