Calculate power of the test $H_0$: $\sigma^2 \leq \sigma_0^2$ vs. $H_1$: $\sigma^2 > \sigma_0^2$ for $\mathcal{N}(0, \sigma^2)$ data.

probability theory, statistics

Let $X_1, \dots, X_n \overset{\text{iid}}{\sim} \mathcal{N}(0, \sigma^2)$. We consider the testing problem $H_0$: $\sigma^2 \leq \sigma_0^2$ vs. $H_1$: $\sigma^2 > \sigma_0^2$ and the statistical test $\delta:\mathbb{R}^n \rightarrow \{0, 1\}$ defined as follows:

$$\delta(x) = \begin{cases}
1 &\mbox{if } \ \frac{1}{\sigma_0^2}\sum_{i=1}^n x_i^2 \geq \chi^{2-}_{n, \, 1 - \alpha}\\
0 &\mbox{else}
\end{cases}$$

where $\chi^{2-}_{n, \, 1 - \alpha}$ denotes the $1-\alpha$ quantile of the Chi-squared distribution with $n$ degrees of freedom. (I've heard that this is the uniformly most powerful test for our situation, is that correct?).

We now want to calculate the power of this test. I.e., given $X = (X_1, \dots, X_n)$, we evaluate for $\sigma^2 \in (0, \infty)$:

\begin{align}
\mathbb{E}_{\sigma^2}(\delta(X)) &= \mathbb{P}_{\sigma^2}(\delta(X) = 1)\\
&= \mathbb{P}_{\sigma^2} \left( \frac{1}{\sigma_0^2}\sum_{i=1}^n X_i^2 \geq \chi^{2-}_{n, \, 1 - \alpha} \right)\\
&= \mathbb{P}_{\sigma^2} \left( \frac{1}{\sigma^2}\sum_{i=1}^n X_i^2 \geq \frac{\sigma_0^2}{\sigma^2}\chi^{2-}_{n, \, 1 - \alpha} \right)\\
&= 1 - \mathbb{P}_{\sigma^2} \left( \frac{1}{\sigma^2}\sum_{i=1}^n X_i^2 \leq \frac{\sigma_0^2}{\sigma^2}\chi^{2-}_{n, \, 1 - \alpha} \right)
\end{align}

Note that $\left(\frac{1}{\sigma^2}\sum_{i=1}^n X_i^2 \right) \sim \chi^{2}_{n}$. Hence, when $\sigma^2 = \sigma_0^2$ we have
\begin{align}\mathbb{E}_{\sigma^2}(\delta(X)) &= 1 - \mathbb{P}_{\sigma^2} \left( \frac{1}{\sigma^2}\sum_{i=1}^n X_i^2 \leq \chi^{2-}_{n, \, 1 - \alpha} \right)\\
&= 1 - (1 - \alpha)\\
&= \alpha
\end{align}

My question: can we explicitly calculate $\mathbb{E}_{\sigma^2}(\delta(X))$ when $\sigma^2 \neq \sigma_0^2$?

Best Answer

It seems you have verified that at the boundary of $H_0$ ($\sigma^2 = \sigma_0^2$), the rejection rate of a level-$\alpha$ test is $\alpha$; for a test at the 5% level it is 0.05, as it should be.

Yes, it is possible to find the power of the test for a particular alternative value $\sigma_a^2.$ You can use an analytical method similar to what you did for the significance level when the null hypothesis is true. (For numerical results you will need software or a suitable printed table of the chi-squared distribution.)
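As a sketch of that analytical route, using only the question's notation: write $F_n$ for the CDF of the $\chi^2_n$ distribution (a symbol introduced here for convenience). Under $\mathbb{P}_{\sigma^2}$ we have $\frac{1}{\sigma^2}\sum_{i=1}^n X_i^2 \sim \chi^2_n$ for every $\sigma^2 > 0$, so the last line of the question's derivation becomes

$$\mathbb{E}_{\sigma^2}(\delta(X)) = 1 - F_n\!\left(\frac{\sigma_0^2}{\sigma^2}\,\chi^{2-}_{n, \, 1 - \alpha}\right).$$

This expression equals $\alpha$ at $\sigma^2 = \sigma_0^2$, increases with $\sigma^2$, and tends to 1 as $\sigma^2 \to \infty.$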

I will show results from simulations. In many power computations, simulation is necessary because the distribution of the test statistic when $H_0$ is false is unknown or too messy to handle analytically. However, I hope you will see from my simulations how to find the power analytically for your question.

One test with null hypothesis true. When $H_0: \sigma^2 = 36$ is true and is tested against $H_a: \sigma^2 > 36$ at the 5% level, we do not reject in the example below because the P-value exceeds 5%.

For example, given a vector x of $n=100$ observations from $\mathsf{Norm}(\mu = 0, \sigma=6),$ the test would look like this:

set.seed(114)
n = 100;  sg = 6        
x = rnorm(n, 0, sg)           # data
v = mean(x^2);  v             # variance est
[1] 37.91495
h = n*v/36;  h                # test stat
[1] 105.3193
pv = 1 - pchisq(h,100);  pv   # P-val of test
[1] 0.3384785                 # > 5%; don't rej

Many tests. With null hypothesis true, testing at level 5%, the P-value should be less than 5% in 5% of the tests.

set.seed(2022)
pv = replicate(10^5, 
       1-pchisq(sum(rnorm(100,0,6)^2)/36, 100))
mean(pv <= .05)
[1] 0.05044         # aprx significance level
2*sd(pv <= .05)/sqrt(10^5)
[1] 0.001384143     # 95% margin of sim error

So, the significance level of the test is $0.0504 \pm 0.0014;$ essentially 5% as expected. (With 100,000 iterations we can expect about 2-place accuracy for the result.)

Moreover, for an exact test using a continuous-valued test statistic, the P-value is distributed $\mathsf{Unif}(0,1)$ under $H_0,$ as illustrated by the histogram below. Rejection occurs only for the 5% of instances to the left of the vertical red line.

hist(pv, prob=T, col="skyblue2")
 abline(v = .05, lwd=2, col="red")

[Figure: histogram of the simulated P-values under $H_0$, with a vertical red line at 0.05]

Many tests when the null hypothesis is false. By contrast, when $\sigma^2 = 64 > \sigma_0^2 = 36,$ the rejection rate is much higher. We repeat the simulation above with the alternative value $\sigma_a^2 = 64.$

set.seed(2022)
pv = replicate(10^5, 
       1-pchisq(sum(rnorm(100,0,8)^2)/36, 100))
mean(pv <= .05)
[1] 0.99025    # aprx power of the test

The power of the test (rejection probability when the null hypothesis is false in a particular way) is about 99%.
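The simulated power can be checked against an exact value. Because $\frac{1}{\sigma^2}\sum_{i=1}^n X_i^2 \sim \chi^2_n$ when the true variance is $\sigma^2,$ the test rejects exactly when that chi-squared variable exceeds $\sigma_0^2/\sigma^2$ times the critical value. Here is a minimal R sketch of that calculation; the helper name `power_var_test` and its argument names are my own, not from any package.

```r
# Exact power of the one-sided chi-squared variance test for N(0, sigma^2) data.
# power_var_test is a hypothetical helper name, not part of any package.
power_var_test <- function(sigma2, sigma2_0, n, alpha = 0.05) {
  crit <- qchisq(1 - alpha, df = n)   # critical value of the level-alpha test
  # With true variance sigma2, sum(X^2)/sigma2 ~ chisq(n); the test rejects
  # when this variable exceeds (sigma2_0 / sigma2) * crit.
  pchisq(sigma2_0 / sigma2 * crit, df = n, lower.tail = FALSE)
}

power_var_test(64, 36, n = 100)   # compare with the simulated power above
power_var_test(36, 36, n = 100)   # at sigma^2 = sigma_0^2 this returns alpha
```

The first call should agree with the simulated rejection rate to within simulation error. Evaluating the function over a grid of `sigma2` values gives the whole power curve without any simulation.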

Now, the distribution of the P-value is far from uniform.

hist(pv, prob=T, col="skyblue2")
 abline(v = .05, lwd=2, col="red")

[Figure: histogram of the simulated P-values when $\sigma^2 = 64$, with a vertical red line at 0.05]
