$\newcommand{\szdp}[1]{\!\left(#1\right)} \newcommand{\szdb}[1]{\!\left[#1\right]}$
Problem Statement: Let $Y_1,\dots,Y_n$ be a random sample from the probability
density function given by
$$f(y|\theta)=
\begin{cases}
\dfrac1\theta\,m\,y^{m-1}\,e^{-y^m/\theta},&y>0\\
0,&\text{elsewhere}
\end{cases}
$$
with $m$ denoting a known constant.
- Find the uniformly most powerful test for testing
$H_0:\theta=\theta_0$ against $H_a:\theta>\theta_0.$
- If the test in 1. is to have $\theta_0=100, \alpha=0.05,$ and
$\beta=0.05$ when $\theta_a=400,$ find the appropriate sample size and
critical region.
Note 1: This is Problem 10.80 in Mathematical Statistics with Applications, 5th. Ed., by Wackerly, Mendenhall, and Sheaffer.
Note 2: This is cross-posted here.
My Work So Far:
- This is a Weibull distribution. We construct the
likelihood function
$$L(\theta)=\szdp{\frac{m}{\theta}}^{\!\!n}\szdb{\prod_{i=1}^ny_i^{m-1}}
\exp\szdb{-\frac1\theta\sum_{i=1}^ny_i^m}.$$
Now we form the inequality indicated in the Neyman-Pearson Lemma:
\begin{align*}
\frac{L(\theta_0)}{L(\theta_a)}&<k\\
\frac{\displaystyle \szdp{\frac{m}{\theta_0}}^{\!\!n}\prod_{i=1}^ny_i^{m-1}
\exp\szdb{-\frac{1}{\theta_0}\sum_{i=1}^ny_i^m}}
{\displaystyle \szdp{\frac{m}{\theta_a}}^{\!\!n}\prod_{i=1}^ny_i^{m-1}
\exp\szdb{-\frac{1}{\theta_a}\sum_{i=1}^ny_i^m}}&<k\\
\frac{\displaystyle \theta_a^n
\exp\szdb{-\frac{1}{\theta_0}\sum_{i=1}^ny_i^m}}
{\displaystyle \theta_0^n
\exp\szdb{-\frac{1}{\theta_a}\sum_{i=1}^ny_i^m}}&<k\\
\frac{\theta_a^n}{\theta_0^n}\,\exp\szdb{-\frac{\theta_a-\theta_0}
{\theta_0\theta_a}\sum_{i=1}^ny_i^m}&<k\\
n\ln(\theta_a/\theta_0)-\frac{\theta_a-\theta_0}
{\theta_0\theta_a}\sum_{i=1}^ny_i^m&<\ln(k)\\
n\ln(\theta_a/\theta_0)-\ln(k)&<\frac{\theta_a-\theta_0}
{\theta_0\theta_a}\sum_{i=1}^ny_i^m.
\end{align*}
The end result is
$$\sum_{i=1}^ny_i^m>\frac{\theta_0\theta_a}{\theta_a-\theta_0}
\szdb{n\ln(\theta_a/\theta_0)-\ln(k)},$$
or
$$\sum_{i=1}^ny_i^m>k'.$$
- We have to discover the distribution of $\displaystyle \sum_{i=1}^ny_i^m.$
I claim that the random variable $W=Y^m$ is exponentially distributed with
parameter $\theta.$ Proof:
\begin{align*}
f_W(w)
&=f\szdp{w^{1/m}}\frac{dw^{1/m}}{dw}\\
&=\frac{m}{\theta}\,(w^{1/m})^{m-1}\,e^{-w/\theta}\szdp{\frac1m}\,w^{(1/m)-1}\\
&=\frac1\theta\,w^{1-1/m}e^{-w/\theta}\,w^{(1/m)-1}\\
&=\frac1\theta\,e^{-w/\theta},
\end{align*}
which is the distribution of an exponential with parameter $\theta,$ as I
claimed. It follows, then, that $\displaystyle\sum_{i=1}^ny_i^m$ is
$\Gamma(n,\theta)$ distributed, and hence that
$\displaystyle\frac{2}{\theta}\sum_{i=1}^ny_i^m$ is $\chi^2$ distributed with
$2n$ d.o.f. So we can write the rejection region (RR) as the region where
$$\frac{2}{\theta}\sum_{i=1}^ny_i^m>\chi_\alpha^2,$$
with $2n$ d.o.f. Let
$$U(\theta)=\frac{2}{\theta}\sum_{i=1}^ny_i^m.$$
Then we have
\begin{align*}
\alpha&=P\szdp{U(\theta_0)>\chi_\alpha^2}\\
\beta&=P\szdp{U(\theta_a)<\chi_\beta^2}.
\end{align*}
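As a sanity check on the distributional claim above, here is a minimal simulation sketch. It draws $Y$ by inverting the CDF $F(y)=1-e^{-y^m/\theta}$ and compares the sample mean and variance of $U(\theta)$ with the $\chi^2$ values $2n$ and $4n$. The function names, the seed, and the choice $m=3$ are assumptions made purely for illustration; none of them come from the textbook problem.

```python
import math
import random

def sample_y(theta, m, rng):
    """Draw one Y with density (m/theta) y^(m-1) exp(-y^m/theta) by inverse CDF."""
    u = rng.random()
    # F(y) = 1 - exp(-y^m / theta)  =>  y = (-theta * ln(1 - u))^(1/m)
    return (-theta * math.log(1.0 - u)) ** (1.0 / m)

def simulate_u(theta, m, n, reps, seed=0):
    """Return `reps` simulated values of U = (2/theta) * sum(Y_i^m)."""
    rng = random.Random(seed)
    return [
        2.0 / theta * sum(sample_y(theta, m, rng) ** m for _ in range(n))
        for _ in range(reps)
    ]

if __name__ == "__main__":
    theta, m, n = 100.0, 3.0, 7   # illustrative values; m is arbitrary here
    us = simulate_u(theta, m, n, reps=20000)
    mean = sum(us) / len(us)
    var = sum((u - mean) ** 2 for u in us) / (len(us) - 1)
    print(f"sample mean {mean:.3f} (chi-square mean is 2n = {2 * n})")
    print(f"sample var  {var:.3f} (chi-square variance is 4n = {4 * n})")
```

Matching two moments does not prove the distribution, of course, but a clear mismatch would flag an error in the change of variables.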
So now we solve
\begin{align*}
\frac{2}{\theta_0}\sum_{i=1}^ny_i^m&=\chi_\alpha^2\\
\frac{2}{\theta_a}\sum_{i=1}^ny_i^m&=\chi_\beta^2\\
\frac{\chi_\alpha^2\theta_0}{2}&=\frac{\chi_\beta^2\theta_a}{2}\\
\frac{\chi_\alpha^2}{\chi_\beta^2}&=\frac{\theta_a}{\theta_0}.
\end{align*}
So we choose $n$ so that the two $\chi^2$ quantiles have the required ratio.
Here $\theta_a/\theta_0=4,$ and we take $\chi_\alpha^2$ from the upper tail
and $\chi_\beta^2$ from the lower tail so that their ratio is $4,$
by varying $n$. This happens at d.o.f. $13=2n,$ which means we must choose
$n=7.$ For this choice of $n,$ we have the critical region as
$$\frac{2}{\theta_0}\sum_{i=1}^ny_i^m>23.6848.$$
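The table lookup can also be done numerically. The sketch below is one way to do it using only the standard library, relying on the closed-form CDF of a $\chi^2$ variable with an even number $2n$ of d.o.f., namely $1-e^{-x/2}\sum_{k=0}^{n-1}(x/2)^k/k!$, and bisection for the quantiles. The helper names are hypothetical, and the range of $n$ printed is an arbitrary choice.

```python
import math

def chi2_cdf_even(x, n):
    """CDF of chi-square with 2n d.o.f. at x, via the Erlang/Poisson sum."""
    if x <= 0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0           # k = 0 term of sum_{k<n} half^k / k!
    for k in range(1, n):
        term *= half / k
        total += term
    return 1.0 - math.exp(-half) * total

def chi2_quantile_even(p, n, tol=1e-10):
    """Lower-tail quantile (CDF = p) of chi-square with 2n d.o.f., by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, n) < p:  # grow the bracket until it contains p
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_cdf_even(mid, n) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def quantile_ratio(n, alpha=0.05, beta=0.05):
    """Ratio of the upper-alpha point to the lower-beta point, 2n d.o.f."""
    upper = chi2_quantile_even(1.0 - alpha, n)
    lower = chi2_quantile_even(beta, n)
    return upper / lower

if __name__ == "__main__":
    # Print n, d.o.f. = 2n, and the quantile ratio to compare against 4.
    for n in range(4, 10):
        print(n, 2 * n, round(quantile_ratio(n), 3))
```

Since only even d.o.f. $2n$ can occur, the ratio can only be compared with $\theta_a/\theta_0=4$ at integer $n$; it will not in general hit $4$ exactly.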
My Question: This is one of the most complicated stats problems I've encountered yet in this textbook, and I just want to know if my solution is correct. I feel like I'm "out on a limb" with complex reasoning depending on complex reasoning. I'm fairly confident that part 1 is correct, but what about part 2?
Best Answer
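You've got $\frac{L(\theta_0)}{L(\theta_a)}=\frac{\theta_a^n}{\theta_0^n}\,\exp\left({-\frac{\theta_a-\theta_0} {\theta_0\theta_a}\sum_{i=1}^ny_i^m}\right)<k$, which is good. Now we take the $\log$ of both sides:
$$n \log\left(\frac{\theta_a}{\theta_0}\right)+\left(\frac{\theta_0-\theta_a}{\theta_0\theta_a}\right)\sum_{i=1}^{n}{y_i^m} < \log(k).$$
Since $\theta_0-\theta_a<0$, dividing by the coefficient of the sum reverses the inequality, and so the rejection region of the test has the form $$\left\{ \sum_{i=1}^{n}{y_i^m} > c \right\}$$
(that is, we reject $H_0$ if $\sum_{i=1}^{n}{y_i^m} > c$). The form of this region does not depend on the particular $\theta_a>\theta_0$, which is what makes the test uniformly most powerful.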
Now, for part (b), there's something to note here: $y^m$ has an exponential distribution, and so $\sum_{i=1}^{n}{y^m_i}\sim \Gamma(n,\theta)$. Under the null, $\frac{2\sum_{i=1}^{n}{y_i^m}}{\theta_0}$ has a $\chi^2$ distribution with $2n$ degrees of freedom (look up the relation between the gamma and chi-squared distributions), so the rejection event can be written as $\frac{2\sum_{i=1}^{n}{y_i^m}}{\theta_0} > \frac{2c}{\theta_0}$.
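For reference, one way to see the gamma/chi-squared relation is through moment generating functions. This is a standard argument, sketched here with $S=\sum_{i=1}^{n}{y_i^m}$ as shorthand (the symbol $S$ is introduced only for this aside):

\begin{align*}
M_S(t)&=(1-\theta t)^{-n},\qquad t<\frac1\theta,\\
M_{2S/\theta}(t)&=M_S\!\left(\frac{2t}{\theta}\right)=(1-2t)^{-n}=(1-2t)^{-(2n)/2},\qquad t<\frac12,
\end{align*}

and the last expression is the moment generating function of a $\chi^2$ distribution with $2n$ degrees of freedom.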
Now let's solve (b):
$$\theta_0=100,\theta_a=400,\alpha=0.05,\beta=0.05$$
When $H_0$ is true, we get $\alpha$ using:
$$\alpha=P\left(\frac{2\sum_{i=1}^{n}{y_i^m}}{100} > \chi^2_{0.05}\right)=0.05.$$
When $H_a$ is true, we get $\beta$ using:
$$\beta=P\left(\frac{2\sum_{i=1}^{n}{y_i^m}}{100} \le \chi^2_{0.05} \middle| \theta=400\right)=P\left(\frac{2\sum_{i=1}^{n}{y_i^m}}{400} \le \frac{1}{4}\chi^2_{0.05} \middle| \theta=400\right)=P\left(\chi^2\le\frac{1}{4}\chi^2_{0.05}\right)=0.05$$
So, we need to find the row in the $\chi^2$ table where $\frac{1}{4}\chi^2_{0.05}=\chi^2_{0.95}$.
You can see that for $12$ degrees of freedom, $\chi^2_{0.95}=5.226$ and $\chi^2_{0.05}=21.03$, which is the closest we get to achieving $\frac{1}{4}\chi^2_{0.05}=\chi^2_{0.95}$. Recall that this has $2n$ degrees of freedom, so the appropriate sample size is $6$.
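Because the table only gives an approximate match, it can help to compute the error rates that a given $n$ and critical value actually deliver. The sketch below does this with the closed-form CDF of a gamma distribution with integer shape; the function names are hypothetical, and the two critical values used are simply the ones quoted in this thread ($21.03$ for $12$ d.o.f. and $23.6848$ for $14$ d.o.f.).

```python
import math

def gamma_cdf_int_shape(s, n, theta):
    """P(S <= s) for S ~ Gamma(shape n, scale theta), n a positive integer."""
    if s <= 0:
        return 0.0
    z = s / theta
    term, total = 1.0, 1.0           # k = 0 term of sum_{k<n} z^k / k!
    for k in range(1, n):
        term *= z / k
        total += term
    return 1.0 - math.exp(-z) * total

def error_rates(n, chi2_crit, theta0=100.0, theta_a=400.0):
    """Return (alpha, beta) for the test rejecting when 2*S/theta0 > chi2_crit."""
    c = chi2_crit * theta0 / 2.0                      # cutoff on S = sum y_i^m
    alpha = 1.0 - gamma_cdf_int_shape(c, n, theta0)   # P(reject | theta0)
    beta = gamma_cdf_int_shape(c, n, theta_a)         # P(fail to reject | theta_a)
    return alpha, beta

if __name__ == "__main__":
    # Critical values as quoted in the thread, not recomputed here.
    for n, crit in [(6, 21.03), (7, 23.6848)]:
        alpha, beta = error_rates(n, crit)
        print(f"n={n}: alpha={alpha:.4f}, beta={beta:.4f}")
```

A $\beta$ above $0.05$ at a given $n$ means that sample size falls short of the stated requirement, while a $\beta$ below $0.05$ means it exceeds it, so this comparison shows whether "closest" or "smallest $n$ meeting both targets" is the criterion being applied.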