Statistics – How to Find MLE of a Piecewise PDF?

maximum likelihoodparameter estimationstatistical-inferencestatistics

Suppose $X_1,\ldots, X_n$ are i.i.d. random variables having pdf
$$
f_{\theta}(x)=\left\{\begin{array}{ll}{\theta,} & {0 \leqslant x \leqslant 1} \\ {1-\theta,} & {1<x \leqslant 2}\end{array}\right.
$$
Give the maximum likelihood estimate of $\theta$.

I know the likelihood function of $(X_1,\ldots, X_n)$ is
$$\ell(\theta;x)=\prod_{i=1}^{n}\left[\theta I(0<x_i<1)+(1-\theta)I(1<x_i<2)\right]$$
But I don't know how to compute $\frac{\partial \log \ell(\theta)}{\partial \theta}=0$ to get $\hat{\theta}$.

Is there anyone can tell me？

Best Answer

What you have written for the likelihood function is technically correct but you cannot reasonably derive an MLE in this setup because of the additive nature of the function. In this case, you can suppose that $n_1$ observations in the sample are in $[0,1]$ and $n_2$ observations are in $[1,2]$, with $n_1+n_2=n$.

Given the sample, rewrite the likelihood function as $$L(\theta)=\left(\prod_{i:0\le x_i\le 1}\theta\right)\left(\prod_{i:1<x_i\le 2}(1-\theta)\right)\quad,\,\theta\in(0,1)$$

This is the same as saying $$L(\theta)=\theta^{n_1}(1-\theta)^{n_2}\quad,\,\theta\in(0,1)$$

Differentiating the log-likelihood wrt $\theta$ yields

$$\frac{\partial}{\partial\theta}\ln L(\theta)=\frac{n_1}{\theta}-\frac{n_2}{1-\theta}$$

So a critical point of the likelihood is $$\hat\theta=\frac{n_1}{n_1+n_2}=\frac{n_1}{n}$$

Related Solutions

Statistics – Finding MLE of f(x; ?) = (? + 1)x^?

Note that because each $x_i \in (0,1)$, then $\log x_i \in (-\infty, 0)$, hence $$-\frac{n}{\sum \log x_i} \in (0, \infty).$$ It follows that $\hat \theta_{MLE} \in (-1, \infty)$, as desired. There is no issue.

Also note that your computation of the likelihood should be more precisely written as $$\mathcal L(\theta \mid \boldsymbol x) = \prod_{i=1}^n (\theta+1) x_i^\theta \mathbb 1 (0 < \color{red}{x_i} < 1) \mathbb 1 (\theta > -1).$$ This in turn is equivalent to $$\mathcal L (\theta \mid \boldsymbol x) = (\theta+1)^n \left(\prod_{i=1}^n x_i \right)^\theta \mathbb 1(0 < x_{(1)}) \mathbb 1(x_{(n)} > 1) \mathbb 1 (\theta > -1).$$

Statistics – MLE of Beta Distribution Explained

Additional comments: Your answer seems OK. It may be of interest to know that $\hat \theta$ is not unbiased. One can get a rough idea of the distribution of $\hat \theta$ for a particular $\theta$ by simulating many samples of size $n.$ I don't know of a convenient 'unbiasing' constant multiple. The Wikipedia article I linked in my Comment above gives more information.

Here is a simulation for $n = 10$ and $\theta = 5.$

th = 5;  n = 10
th.mle = -n/replicate(10^6, sum(log(rbeta(n, th, 1))))
mean(th.mle)
## 5.555069   # aprx expectation of th.mle > th = 5.
median(th.mle)
## 5.172145

The histogram below shows the simulated distribution of $\hat \theta.$ The vertical red line is at the mean of that distribution, and the green curve is its kernel density estimator (KDE). According to the KDE, its mode is near $4.62.$

den.inf = density(th.mle)
den.inf$x[den.inf$y==max(den.inf$y)]
## 4.624876

hist(th.mle, br=50, prob=T, col="skyblue2", main="")
abline(v = mean(th.mle), col="red")
lines(density(th.mle), lwd=2, col="darkgreen")

Addendum on Parametric Bootstrap Confidence Interval for $\theta:$

In order to find a confidence interval (CI) for $\theta$ based on MLE $\hat \theta,$ we would like to know the distribution of $V = \frac{\hat \theta}{\theta}.$ When that distribution is not readily available, we can use a parametric bootstrap.

If we knew the distribution of $V,$ then we could find numbers $L$ and $U$ such that $P(L \le V = \hat\theta/\theta \le U) = 0.95$ so that a 95% CI would be of the form $\left(\frac{\hat \theta}{U},\, \frac{\hat\theta}{L}\right).$ Because we do not know the distribution of $V$ we use a bootstrap procedure to get serviceable approximations $L^*$ and $U^*$ of $L$ and $U.$ respectively.

To begin, suppose we have a random sample of size $n = 50$ from $\mathsf{Beta}(\theta, 1)$ where $\theta$ is unknown and its observed MLE is $\hat \theta = 6.511.$

Entering, the so-called 'bootstrap world'. we take repeated 're-samples` of size $n=50$ from $\mathsf{Beta}(\hat \theta =6.511, 0),$ Then we we find the bootstrap estimate $\hat \theta^*$ from each re-sample. Temporarily using the observed MLE $\hat \theta = 6.511$ as a proxy for the unknown $\theta,$ we find a large number $B$ of re-sampled values $V^* = \hat\theta^2/\hat \theta.$ Then we use quantiles .02 and .97 of these $V^*$'s as $L^*$ and $U^*,$ respectively.

Returning to the 'real world' the observed MLE $\hat \theta$ returns to its original role as an estimator, and the 95% parametric bootstrap CI is $\left(\frac{\hat\theta}{U^*},\, \frac{\hat\theta}{L^*}\right).$

The R code, in which re-sampled quantities are denoted by .re instead of $*$, is shown below. For this run with set.seed(213) the 95% CI is $(4.94, 8.69).$ Other runs with unspecified seeds using $B=10,000$ re-samples of size $n = 50$ will give very similar values. [In a real-life application, we would not know whether this CI covers the 'true' value of $\theta.$ However, I generated the original 50 observations using parameter value $\theta = 6.5,$ so in this demonstration we do know that the CI covers the true parameter value $\theta.$ We could have used the probability-symmetric CI with quantiles .025 and .975, but the one shown is a little shorter.]

set.seed(213)
B = 10000;  n = 50;  th.mle.obs=6.511
v.re = th.mle.obs/replicate(B, -n/sum(log(rbeta(n,th.mle.obs,1))))
L.re = quantile(v.re, .02);  U.re = quantile(v.re, .97)
c(th.mle.obs/U.re, th.mle.obs/L.re)
##     98%       3% 
## 4.936096 8.691692

Best Answer

Related Solutions

Statistics – Finding MLE of f(x; ?) = (? + 1)x^?

Statistics – MLE of Beta Distribution Explained

Related Question