Solved – Relation between observed power and p-value

hypothesis testing

I am trying to understand the relation between observed power and p-value in Stephane's reply, which I think is based on J. M. Hoenig and D. M. Heisey (2001), "The Abuse of Power: The Pervasive Fallacy of Power Calculations for Data Analysis," The American Statistician 55(1), 19-24:

[screenshot of the quoted passage from Stephane's reply]

I don't quite understand the quote, and I hope that someone could state it more clearly. Here is what I don't understand:

  1. In the $Z$-test, why is the parameter value $\delta = \sqrt{n}\mu/\sigma$? I thought a parameter of a distribution shouldn't depend on the sample size $n$.
  2. The $100(1-p)$th percentile $Z_p$ of the standard normal
    distribution doesn't depend on the sample $X$, does it? Why is it the observed statistic? Does the observed statistic mean the test statistic of the Z-test?
  3. Why is the cdf $G_\delta$ of the p-value with the parameter value $\delta$ given by $$G_{\delta}(p) = 1 - \Phi(Z_p-\delta)?$$
  4. The observed power depends on an observation $X$ of the sample, doesn't it? Is it defined as the value of the power function at some estimate of the parameter $\delta$ from the sample $X$? Why is the observed power given by $$G_{Z_p}(\alpha) = 1- \Phi(Z_{\alpha} - Z_p)?$$
  5. Does the relation between the observed power and the p-value for Z-test still apply for other tests?

Thanks!

Best Answer

Answers to questions 1, 2, 3, 4 ($Z$-test)

The decreasing link between the $p$-value and the observed power is intuitively expected: the $p$-value $p^{\text{obs}}$ is low when the observed sample mean $\bar y^{\text{obs}}$ is high ($H_1$ favoured), and since $\bar y^{\text{obs}} = \hat\mu$, the observed power is then high because the power function $\mu \mapsto \Pr(\text{reject } H_0)$ is increasing.

Below is a mathematical proof.

Assume $n$ independent observations $y^{\text{obs}}_1, \ldots, y^{\text{obs}}_n$ from ${\cal N}(\mu, \sigma^2)$ with known $\sigma$. The $Z$-test consists of rejecting the null hypothesis $H_0:\{\mu=0\}$ in favour of $H_1:\{\mu >0\}$ when the sample mean $\bar y \sim {\cal N}(\mu, {(\sigma/\sqrt{n})}^2)$ is high. Thus the $p$-value is $$p^{\text{obs}}=\Pr({\cal N}(0, {(\sigma/\sqrt{n})}^2) > \bar y^{\text{obs}})=1-\Phi\left(\frac{\sqrt{n}\bar y^{\text{obs}}}{\sigma} \right) \quad (\ast)$$ where $\Phi$ is the cumulative distribution of ${\cal N}(0,1)$.
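
Just to illustrate formula $(\ast)$ numerically, here is a small R sketch with made-up numbers ($n=6$, $\sigma=1$, and a hypothetical true mean $\mu=0.8$ used only to simulate data):

set.seed(42)
n <- 6; sigma <- 1
y <- rnorm(n, mean=0.8, sd=sigma)        # hypothetical data, true mu = 0.8 chosen arbitrarily
z_obs <- sqrt(n)*mean(y)/sigma           # observed Z statistic
p_obs <- pnorm(z_obs, lower.tail=FALSE)  # p-value, formula (*)
p_obs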

Thus, choosing a significance level $\alpha$, one rejects $H_0$ when $p^{\text{obs}} \leq \alpha$, and this is equivalent to $$\frac{\sqrt{n}\bar y^{\text{obs}}}{\sigma} \geq \Phi^{-1}(1-\alpha)=:z_\alpha.$$ But $\frac{\sqrt{n}\bar y}{\sigma} \sim {\cal N}(\delta,1)$ with $\boxed{\delta=\delta(\mu)=\frac{\sqrt{n}\mu}{\sigma}}$, therefore the probability that the above inequality occurs is $\Pr({\cal N}(\delta,1) \geq z_\alpha) = 1-\Phi(z_\alpha-\delta)$. The same computation with $\alpha$ replaced by an arbitrary $p$ gives the cdf of the $p$-value, $G_\delta(p)=\Pr(P \leq p)=1-\Phi(z_p-\delta)$, which answers question 3. We have just derived the power function $$\mu \mapsto \Pr(\text{reject } H_0) =1-\Phi(z_\alpha-\delta(\mu))$$ which is, as expected, an increasing function:

alpha <- 5/100                             # significance level
z_alpha <- qnorm(alpha, lower.tail=FALSE)  # critical value z_alpha = Phi^{-1}(1 - alpha)
n <- 6
sigma <- 1
pow <- function(mu){                       # power function 1 - Phi(z_alpha - delta(mu))
  delta <- sqrt(n)*mu/sigma
  1-pnorm(z_alpha-delta)
}
curve(pow(x), from=0, to=2, xlab=expression(mu), ylab="Power")

[plot of the power function produced by the code above]

The observed power is the power function evaluated at the estimate $\hat\mu=\bar y^{\text{obs}}$ of the unknown parameter $\mu$. This gives $1-\Phi\left(z_\alpha- \frac{\sqrt{n}\bar y^{\text{obs}}}{\sigma} \right)$, but the formula $(\ast)$ for the $p$-value $p^{\text{obs}}$ shows that $$\frac{\sqrt{n}\bar y^{\text{obs}}}{\sigma}=z_{p^{\text{obs}}},$$ so the observed power equals $1-\Phi\left(z_\alpha - z_{p^{\text{obs}}}\right)$ (the formula in question 4), a strictly decreasing function of $p^{\text{obs}}$.
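
Here is a quick, self-contained numerical check in R (with the same kind of made-up data as in the sketch above) that evaluating the power function at $\hat\mu$ and plugging $z_{p^{\text{obs}}}$ into $1-\Phi(z_\alpha - \cdot)$ give the same observed power:

set.seed(42)
n <- 6; sigma <- 1; alpha <- 5/100
z_alpha <- qnorm(alpha, lower.tail=FALSE)
y <- rnorm(n, mean=0.8, sd=sigma)                         # hypothetical data, as before
p_obs <- pnorm(sqrt(n)*mean(y)/sigma, lower.tail=FALSE)   # p-value, formula (*)
pow_from_muhat <- 1 - pnorm(z_alpha - sqrt(n)*mean(y)/sigma)           # power at mu = muhat
pow_from_pval  <- 1 - pnorm(z_alpha - qnorm(p_obs, lower.tail=FALSE))  # 1 - Phi(z_alpha - z_p)
c(pow_from_muhat, pow_from_pval)                          # agree up to rounding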

An answer to question 5 ($F$-tests)

For example, the decreasing one-to-one correspondence between the $p$-value and the observed power also holds for any $F$-test of linear hypotheses in classical Gaussian linear models, and this can be shown as follows. All notations are fixed here. The $p$-value is the probability that an $F$-distribution exceeds the observed statistic $f^{\text{obs}}$. The power depends on the parameters only through the noncentrality parameter $\boxed{\lambda=\frac{{\Vert P_Z \mu\Vert}^2}{\sigma^2}}$, and it is an increasing function of $\lambda$ (noncentral $F$-distributions are stochastically increasing in the noncentrality parameter $\lambda$). The observed power approach consists of evaluating the power at $\lambda=\hat\lambda$, obtained by replacing $\mu$ and $\sigma$ in $\lambda$ with their estimates $\hat\mu$ and $\hat\sigma$. If we use the classical estimates, then one has the relation $\boxed{f^{\text{obs}}=\frac{\hat\lambda}{m-\ell}}$, and it is easy to conclude.
In my reply to Tim's previous question I shared a link to some R code evaluating the observed power as a function of the $p$-value.
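
For illustration (this is not the code in that link, just a minimal sketch of the same relation), the following assumes the numerator degrees of freedom of the $F$-test are $m-\ell$ (df1 below) and the denominator degrees of freedom are df2, so that $\hat\lambda=(m-\ell)\,f^{\text{obs}}$ by the boxed relation; df1 = 3, df2 = 20 and alpha = 0.05 are arbitrary illustrative choices:

obs_power_F <- function(p_obs, df1, df2, alpha=0.05){
  f_obs  <- qf(p_obs, df1, df2, lower.tail=FALSE)       # observed F statistic recovered from the p-value
  lambda <- df1*f_obs                                   # hat(lambda) = (m - l) * f_obs
  f_crit <- qf(alpha, df1, df2, lower.tail=FALSE)       # critical value of the F-test
  pf(f_crit, df1, df2, ncp=lambda, lower.tail=FALSE)    # power evaluated at lambda = hat(lambda)
}
curve(obs_power_F(x, df1=3, df2=20), from=0.001, to=1,
      xlab="p-value", ylab="Observed power")            # decreasing, as claimed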
