\begin{align}
& \text{For } \mu \le \min\{x_1,\ldots,x_n\} \text{ and } \alpha>0, \text{ we have} \\[10pt]
L(\mu,\alpha) & = \frac 1 {\Gamma(\alpha)^n} \left( \prod_{i=1}^n (x_i-\mu) \right)^{\alpha-1} \!\!\! \exp \left( -\sum_{i=1}^n (x_i-\mu) \right), \\[10pt]
\ell(\mu,\alpha) & = \log L(\mu,\alpha) = -n\log\Gamma(\alpha) + (\alpha-1) \sum_{i=1}^n \log(x_i-\mu) - \sum_{i=1}^n (x_i-\mu).
\end{align}
You gave us $\alpha<1.$
That implies $\alpha-1<0,$ so that $\ell(\mu,\alpha)$ is an increasing function of $\mu$ (both the $(\alpha-1)\sum_i \log(x_i-\mu)$ term and the $-\sum_i (x_i-\mu)$ term grow as $\mu$ increases) all the way up to the boundary $\mu = \min\{x_1,\ldots,x_n\}.$
Therefore $\widehat\mu = \min\{x_1,\ldots,x_n\}.$ If we didn't have the constraint that $\alpha<1,$ then this would be more complicated.
This value of $\widehat\mu$ does not depend on $\alpha$ as long as $\alpha$ remains in that interval. Therefore we can just plug in $\min$ for $\mu$ and then seek the value of $\alpha\in(0,1)$ that maximizes $\ell(\min,\alpha).$
Now we have
$$
\ell(\min,\alpha) = -n\log\Gamma(\alpha) + (\alpha-1)A + \big( \text{constant} \big)
$$
where "constant" means not depending on $\alpha.$
$$
\frac {\partial\ell}{\partial\alpha} = -n\frac{\Gamma'(\alpha)}{\Gamma(\alpha)} + A.
$$
Etc.
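Since $\Gamma'(\alpha)/\Gamma(\alpha)$ is the digamma function $\psi(\alpha),$ the score equation amounts to $\psi(\alpha) = A/n,$ which has no closed form but is easy to solve numerically. Here is a minimal R sketch of that step (my illustration on simulated data, not part of the original derivation; because plugging $\widehat\mu = \min\{x_i\}$ in exactly would make one term of $A$ equal to $\log 0,$ the sketch nudges $\widehat\mu$ just below the sample minimum):

set.seed(1)
x = rgamma(25, shape = 0.7) + 2       # simulated data: shifted gamma, alpha = 0.7, mu = 2
n = length(x)
mu.hat = min(x) - 1e-6                # just below min(x), avoiding log(0)
A = sum(log(x - mu.hat))
score = function(a) -n*digamma(a) + A # derivative of log likelihood in alpha
alpha.hat = uniroot(score, c(1e-8, 1 - 1e-8))$root  # root in (0, 1)
alpha.hat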
Additional comments: Your answer seems OK. It may be of interest to know that
$\hat \theta$ is not unbiased. One can get a rough idea of the distribution
of $\hat \theta$ for a particular $\theta$ by simulating many samples of
size $n.$ I don't know of a convenient 'unbiasing' constant multiple.
The Wikipedia article I linked in my Comment above gives more information.
Here is a simulation for $n = 10$ and $\theta = 5.$
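For reference, the estimator computed in the code below is the standard MLE for this model: with density $f(x \mid \theta) = \theta x^{\theta-1}$ on $(0,1),$ the log likelihood is
$$
\ell(\theta) = n\log\theta + (\theta - 1)\sum_{i=1}^n \log x_i,
$$
and setting $\ell'(\theta) = \frac n\theta + \sum_{i=1}^n \log x_i = 0$ gives $\hat\theta = -n\Big/\sum_{i=1}^n \log x_i.$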
th = 5; n = 10                     # true parameter and sample size
th.mle = -n/replicate(10^6, sum(log(rbeta(n, th, 1))))  # MLE from each of 10^6 samples
mean(th.mle)
## 5.555069                        # approx expectation of th.mle > th = 5
median(th.mle)
## 5.172145
The histogram below shows the simulated distribution of $\hat \theta.$
The vertical red line is at the mean of that distribution, and the green
curve is its kernel density estimator (KDE). According to the KDE, its mode is near $4.62.$
den.inf = density(th.mle)                  # kernel density estimate (KDE)
den.inf$x[den.inf$y == max(den.inf$y)]     # x-value at the KDE's peak: the mode
## 4.624876
hist(th.mle, br=50, prob=T, col="skyblue2", main="")
abline(v = mean(th.mle), col="red")            # vertical line at the simulated mean
lines(density(th.mle), lwd=2, col="darkgreen") # KDE curve
Addendum on Parametric Bootstrap Confidence Interval for $\theta:$
In order to find a confidence interval (CI) for $\theta$ based on MLE $\hat \theta,$ we would like to know the distribution of $V = \frac{\hat \theta}{\theta}.$ When that distribution is not
readily available, we can use a parametric bootstrap.
If we knew the distribution of $V,$ then we could find numbers $L$ and $U$ such that
$P(L \le V = \hat\theta/\theta \le U) = 0.95$ so that a 95% CI would be of the form
$\left(\frac{\hat \theta}{U},\, \frac{\hat\theta}{L}\right).$ Because we do not know the distribution of $V,$ we use a bootstrap procedure to get serviceable approximations $L^*$ and $U^*$ of $L$ and $U,$ respectively.
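Explicitly, the inversion behind that interval form is just algebra on the event inside the probability statement:
$$
L \le \frac{\hat\theta}{\theta} \le U \iff \frac{\hat\theta}{U} \le \theta \le \frac{\hat\theta}{L},
$$
so the random interval $\left(\frac{\hat\theta}{U},\, \frac{\hat\theta}{L}\right)$ covers $\theta$ with probability $0.95.$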
To begin, suppose we have a random sample of size $n = 50$ from $\mathsf{Beta}(\theta, 1)$
where $\theta$ is unknown and its observed MLE is $\hat \theta = 6.511.$
Entering the so-called 'bootstrap world', we take repeated 're-samples' of size $n=50$
from $\mathsf{Beta}(\hat \theta = 6.511,\, 1).$ Then we find the bootstrap
estimate $\hat \theta^*$ from each re-sample. Temporarily using the observed
MLE $\hat \theta = 6.511$ as a proxy for the unknown $\theta,$ we find a large number $B$ of re-sampled values $V^* = \hat\theta^*/\hat \theta.$ Then we use quantiles .03 and .98 of
these $V^*$'s as $L^*$ and $U^*,$ respectively.
Returning to the 'real world',
the observed MLE $\hat \theta$ returns to its original role as an estimator, and the
95% parametric bootstrap CI is $\left(\frac{\hat\theta}{U^*},\, \frac{\hat\theta}{L^*}\right).$
The R code, in which re-sampled quantities are denoted by .re
instead of $*$, is shown below.
For this run with set.seed(213),
the 95% CI is $(4.94, 8.69).$ Other runs with unspecified
seeds using $B=10,000$ re-samples of size $n = 50$ will give very similar values. [In a real-life application, we would not know whether this CI covers the 'true' value of $\theta.$ However,
I generated the original 50 observations using parameter value $\theta = 6.5,$ so in this demonstration we
do know that the CI covers the true parameter value $\theta.$ We could have used the
probability-symmetric CI with quantiles .025 and .975, but the one shown is a little shorter.]
set.seed(213)
B = 10000; n = 50; th.mle.obs = 6.511                  # observed MLE from the original sample
v.re = replicate(B, -n/sum(log(rbeta(n, th.mle.obs, 1))))/th.mle.obs  # V* = th.re/th.mle.obs
L.re = quantile(v.re, .03); U.re = quantile(v.re, .98)                # L* and U*
c(th.mle.obs/U.re, th.mle.obs/L.re)                                   # 95% CI: (th.mle.obs/U*, th.mle.obs/L*)
## 98% 3%
## 4.936096 8.691692
Best Answer
The density of $X_i$ is given by
$$f_{X_i}(x \mid \theta) = (2i\theta)^{-1}\, \mathbb 1(-i(\theta-1) \le x \le i(\theta+1)).$$
Thus the joint density of the sample $\boldsymbol X = (X_1, X_2, \ldots, X_n)$ is
$$\begin{align} f_{\boldsymbol X}(\boldsymbol x \mid \theta) &= \prod_{i=1}^n (2i\theta)^{-1}\, \mathbb 1(-i(\theta-1) \le x_i \le i(\theta+1)) \\ &= \frac{\theta^{-n}}{2^n\, n!} \prod_{i=1}^n \mathbb 1\left(\left|\frac{x_i}{i} - 1\right| \le \theta\right) \\ &= \frac{\theta^{-n}}{2^n\, n!}\, \mathbb 1 \left( \max_i \left|\frac{x_i}{i} - 1\right| \le \theta\right). \end{align}$$
Hence the likelihood for $\theta$ given the sample is proportional to
$$\mathcal L(\theta \mid \boldsymbol x) \propto \theta^{-n}\, \mathbb 1 \left( \max_i \left|\frac{x_i}{i} - 1\right| \le \theta\right).$$
If the indicator function is $1$, then $\mathcal L$ is monotonically decreasing in $\theta$. Therefore, the likelihood is maximized by the smallest choice of $\theta$ that keeps the indicator equal to $1$; i.e.,
$$\hat \theta = \max_i \left|\frac{x_i}{i} - 1\right|.$$
This is evaluated by taking the sample $(x_1, x_2, \ldots, x_n)$, calculating the modified sample
$$\left(\frac{x_1}{1} - 1, \frac{x_2}{2} - 1, \ldots, \frac{x_n}{n} - 1\right),$$
and taking the maximum absolute value: that is the MLE.
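As a quick numerical check (my own sketch, not part of the original answer, with a hypothetical true value $\theta = 2$), the MLE is one line of R:

set.seed(42)
n = 20; th = 2                                     # hypothetical true theta
i = 1:n
x = runif(n, min = -i*(th - 1), max = i*(th + 1))  # X_i ~ Unif(-i(theta-1), i(theta+1))
th.mle = max(abs(x/i - 1))                         # MLE: max_i |x_i/i - 1|
th.mle

Since every $|x_i/i - 1| \le \theta$ with probability $1,$ the estimate can never exceed the true value, so this MLE is biased downward.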