Maximum Likelihood – How to Find the Unique MVUE

Tags: maximum-likelihood, order-statistics, self-study, sufficient-statistics, umvue

This question is Problem 7.4.9 on page 388 of Robert Hogg's Introduction to Mathematical Statistics, 6th edition.

Let $X_1,\dots,X_n$ be iid with pdf $f(x;\theta)=1/(3\theta)$, $-\theta<x<2\theta$, zero elsewhere, where $\theta>0$.

(a) Find the mle $\hat{\theta}$ of $\theta$.

(b) Is $\hat{\theta}$ a sufficient statistic for $\theta$? Why?

(c) Is $(n+1)\hat{\theta}/n$ the unique MVUE of $\theta$? Why?

I think I can solve (a) and (b), but I am confused by (c).

For (a):

Let $Y_1<Y_2<\cdots<Y_n$ be the order statistics.

$L(\theta;x)=\frac{1}{3\theta}\times\frac{1}{3\theta}\times\cdots\times\frac{1}{3\theta}=\frac{1}{(3\theta)^n}$ when $-\theta<y_1$ and $y_n<2\theta$; elsewhere $L(\theta;x)=0$.

$\frac{dL(\theta;x)}{d\theta}=-\frac{3n}{(3\theta)^{n+1}}$; since $\theta>0$, this derivative is negative, so the likelihood function $L(\theta;x)$ is decreasing in $\theta$.

From $-\theta<y_1$ and $y_n<2\theta$, we get $\theta>-y_1$ and $\theta>y_n/2$, i.e. $\theta>\max(-y_1,\,y_n/2)$.

Since $L(\theta;x)$ is decreasing, the likelihood is maximized at the smallest admissible value of $\theta$; under the constraint $\theta>\max(-y_1,y_n/2)$, the maximum is attained at $\theta=\max(-y_1,y_n/2)$.

$\therefore$ mle $\hat{\theta}=\max(-y_1,\,y_n/2)$.
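As a quick sanity check of this (a small simulation of my own; the values of $\theta$ and the sample sizes below are arbitrary), $\hat{\theta}$ never exceeds $\theta$ and approaches it as $n$ grows:

```python
import numpy as np

# Sanity check of the MLE: draw X_1, ..., X_n ~ Uniform(-theta, 2*theta)
# and compare theta_hat = max(-Y_1, Y_n / 2) with the true theta.
# (theta and the sample sizes are arbitrary choices.)
rng = np.random.default_rng(0)
theta = 2.0
for n in (5, 50, 500):
    x = rng.uniform(-theta, 2 * theta, size=n)
    theta_hat = max(-x.min(), x.max() / 2)
    print(n, theta_hat)  # always below theta, approaching it as n grows
```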

For (b):

$f(x_1;\theta)f(x_2;\theta)\cdots f(x_n;\theta)=\frac{1}{(3\theta)^n}\prod_{i=1}^{n} I(-\theta<x_i<2\theta)=\frac{1}{(3\theta)^n}I\big(\min(x_i)>-\theta\big)\,I\big(\max(x_i)<2\theta\big)=\frac{1}{(3\theta)^n}I\big(\max(-y_1,\,y_n/2)<\theta\big)\times 1$

$\therefore$ By the Neyman factorization theorem, $\hat{\theta}=\max(-y_1,\,y_n/2)$ is a sufficient statistic for $\theta$.

For (c):

First, we find the CDF of $X$:

$F(x)=\int_{-\theta}^{x}\frac{1}{3\theta}dt=\frac{x+\theta}{3\theta},-\theta<x<2\theta$

Next, we can find the pdfs of both $Y_1$ and $Y_n$ from the book's formula for order statistics.

$f(y_1)=\frac{n!}{(1-1)!\,(n-1)!}[F(y_1)]^{1-1}[1-F(y_1)]^{n-1}f(y_1)=n\left[1-\frac{y_1+\theta}{3\theta}\right]^{n-1}\frac{1}{3\theta}=\frac{n}{(3\theta)^n}(2\theta-y_1)^{n-1}$

Similarly,

$f(y_n)=n\left(\frac{y_n+\theta}{3\theta}\right)^{n-1}\frac{1}{3\theta}=\frac{n}{(3\theta)^n}(y_n+\theta)^{n-1}$
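To check these two densities numerically (again a sketch of my own; the parameter values are arbitrary), one can compare the empirical CDFs of $Y_1$ and $Y_n$ with the CDFs implied by the formulas above, $1-[1-F(t)]^n$ and $[F(t)]^n$:

```python
import numpy as np

# Compare empirical CDFs of Y_1 = min(X_i) and Y_n = max(X_i) with the
# CDFs implied by the derived densities: 1 - (1 - F)^n and F^n, where
# F(t) = (t + theta) / (3 * theta). Parameter values are arbitrary.
rng = np.random.default_rng(1)
theta, n, reps = 2.0, 10, 200_000
x = rng.uniform(-theta, 2 * theta, size=(reps, n))
y1, yn = x.min(axis=1), x.max(axis=1)

for t in (-1.0, 0.0, 1.5, 3.0):
    F = (t + theta) / (3 * theta)
    print(t,
          np.mean(y1 <= t), 1 - (1 - F) ** n,  # Y_1: empirical vs analytic
          np.mean(yn <= t), F ** n)            # Y_n: empirical vs analytic
```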

Next, we show the completeness of the families of pdfs of $Y_1$ and $Y_n$.

$E[u(Y_1)]=\int_{-\theta}^{2\theta}u(y_1)\frac{n}{(3\theta)^n}(2\theta-y_1)^{n-1}\,dy_1=0 \Rightarrow \int_{-\theta}^{2\theta}u(y_1)(2\theta-y_1)^{n-1}\,dy_1=0$. Differentiating this integral with respect to $\theta$ (by the FTC), we can show $u(\theta)=0$ for all $\theta>0$.

Therefore, the family of pdfs of $Y_1$ is complete.

Similarly, again by the FTC, we can show that the family of pdfs of $Y_n$ is complete.

The problem now is that we need to show that $\frac{(n+1)\hat{\theta}}{n}$ is unbiased.

When $\hat{\theta}=-y_1$

$E(-y_1)=\int_{-\theta}^{2\theta}(-y_1)\frac{n}{(3\theta)^n}(2\theta-y_1)^{n-1}dy_1=\frac{1}{(3\theta)^n}\int_{-\theta}^{2\theta}y_1d(2\theta-y_1)^n$

We can evaluate the integral by integration by parts:

$E(-y_1)=\frac{1}{(3\theta)^n}[y_1(2\theta-y_1)^n\mid_{-\theta}^{2\theta}-\int_{-\theta}^{2\theta}(2\theta-y_1)^ndy_1]=\frac{1}{(3\theta)^n}[\theta (3\theta)^n-\frac{(3\theta)^{n+1}}{n+1}]=\theta-\frac{3\theta}{n+1}=\frac{(n-2)\theta}{n+1}$

$\therefore E(\frac{(n+1)\hat{\theta}}{n})=\frac{n+1}{n}E(-y_1)=\frac{n+1}{n}\frac{(n-2)\theta}{n+1}=\frac{n-2}{n}\theta$

Therefore, $\frac{(n+1)\hat{\theta}}{n}$ is not an unbiased estimator of $\theta$ when $\hat{\theta}=-y_1$

When $\hat{\theta}=y_n/2$

$E(Y_n)=\int_{-\theta}^{2\theta}y_n\frac{n}{(3\theta)^n}(y_n+\theta)^{n-1}\,dy_n=\frac{1}{(3\theta)^n}\int_{-\theta}^{2\theta}y_n\,d(y_n+\theta)^n=\frac{1}{(3\theta)^n}\left[y_n(y_n+\theta)^n\Big|_{-\theta}^{2\theta}-\int_{-\theta}^{2\theta}(y_n+\theta)^n\,dy_n\right]=\frac{1}{(3\theta)^n}\left[2\theta(3\theta)^n-\frac{(3\theta)^{n+1}}{n+1}\right]=2\theta-\frac{3\theta}{n+1}=\frac{2n-1}{n+1}\theta$

$\therefore E(\frac{(n+1)\hat{\theta}}{n})=\frac{n+1}{n}E(Y_n/2)=\frac{n+1}{2n}E(Y_n)=\frac{n+1}{2n}\frac{2n-1}{n+1}\theta=\frac{2n-1}{2n}\theta$

Still, $\frac{(n+1)\hat{\theta}}{n}$ is not an unbiased estimator of $\theta$ when $\hat{\theta}=y_n/2$
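A quick Monte Carlo check (my own sketch; $\theta$, $n$, and the replication count are arbitrary) agrees with both expectations computed above:

```python
import numpy as np

# Monte Carlo check of E(-Y_1) = (n - 2) * theta / (n + 1) and
# E(Y_n) = (2n - 1) * theta / (n + 1). Parameter values are arbitrary.
rng = np.random.default_rng(2)
theta, n, reps = 2.0, 10, 1_000_000
x = rng.uniform(-theta, 2 * theta, size=(reps, n))

print(np.mean(-x.min(axis=1)), (n - 2) * theta / (n + 1))     # E(-Y_1)
print(np.mean(x.max(axis=1)), (2 * n - 1) * theta / (n + 1))  # E(Y_n)
```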

But the book's answer is that $\frac{(n+1)\hat{\theta}}{n}$ is the unique MVUE. I don't understand why it is an MVUE if it is a biased estimator.

Or maybe my calculations are wrong; please help me find the mistakes. I can give more detailed calculations if needed.

Thank you very much.

Best Answer

Working with extrema requires care, but it doesn't have to be difficult. The crucial question, found near the middle of the post, is

... we need to show that $\frac{(n+1)\hat{\theta}}{n}$ is unbiased.

Earlier you obtained

$$\hat\theta = \max(-y_1, y_n/2) = \max\{-\min\{y_i\}, \max\{y_i\}/2\}.$$

Although that looks messy, the calculations become elementary when you consider the cumulative distribution function $F$. To get started with this, note that $0\le \hat\theta \le \theta$. Let $t$ be a number in this range. By definition,

$$\eqalign{ F(t) &= \Pr(\hat\theta\le t) \\&= \Pr(-y_1 \le t\text{ and }y_n/2 \le t) \\ &= \Pr(-t \le y_1 \le y_2 \le \cdots \le y_n \le 2t). }$$

This is the chance that all $n$ values lie between $-t$ and $2t$. Those values bound an interval of length $3t$. Because the distribution is uniform, the probability that any specific $y_i$ lies in this interval is proportional to its length:

$$\Pr(y_i \in [-t, 2t]) = \frac{3t}{3\theta} = \frac{t}{\theta}.$$

Because the $y_i$ are independent, these probabilities multiply, giving

$$F(t) = \left(\frac{t}{\theta}\right)^n.$$
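If you want to confirm this formula empirically, here is a small sketch (the values of $\theta$, $n$, and the test points are arbitrary choices of mine):

```python
import numpy as np

# Empirical check that Pr(theta_hat <= t) = (t / theta)^n.
# (theta, n, and the test points are arbitrary choices.)
rng = np.random.default_rng(3)
theta, n, reps = 2.0, 10, 500_000
x = rng.uniform(-theta, 2 * theta, size=(reps, n))
theta_hat = np.maximum(-x.min(axis=1), x.max(axis=1) / 2)

for t in (1.0, 1.5, 1.8, 1.95):
    print(t, np.mean(theta_hat <= t), (t / theta) ** n)
```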

The expectation can immediately be found by integrating the survival function $1-F$ over $[0, \theta]$, the interval of possible values of $\hat\theta$, using the substitution $y=t/\theta$:

$$\mathbb{E}(\hat\theta) = \int_0^\theta \left(1 - \left(\frac{t}{\theta}\right)^n\right)dt = \int_0^1 (1-y^n)\theta dy = \frac{n}{n+1}\theta.$$

(This formula for the expectation is derived from the usual integral via integration by parts. Details are given at the end of https://stats.stackexchange.com/a/105464.)
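For completeness, that integration by parts is one line: writing $dF(t)=d[F(t)-1]$ and using $F(\theta)=1$,

$$\mathbb{E}(\hat\theta)=\int_0^\theta t\,dF(t)=\Big[t\big(F(t)-1\big)\Big]_0^\theta-\int_0^\theta\big(F(t)-1\big)\,dt=\int_0^\theta\big(1-F(t)\big)\,dt.$$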

Rescaling by $(n+1)/n$ gives

$$\mathbb{E}\left(\frac{n+1}{n}\,{\hat \theta}\right) = \theta,$$

QED.
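A final numerical sanity check (a sketch with arbitrary parameter values of my choosing) reproduces this:

```python
import numpy as np

# Final check: the mean of (n + 1) / n * theta_hat should equal theta.
# (theta, n, and the replication count are arbitrary choices.)
rng = np.random.default_rng(4)
theta, n, reps = 2.0, 10, 1_000_000
x = rng.uniform(-theta, 2 * theta, size=(reps, n))
theta_hat = np.maximum(-x.min(axis=1), x.max(axis=1) / 2)
print(np.mean((n + 1) / n * theta_hat), theta)  # both approximately 2.0
```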
