Use MGF to show $\hat\beta$ is a consistent estimator of $\beta$

convergence-divergence, statistical-inference, statistics

Suppose that $X_1, \ldots, X_n$ is a random sample from a gamma distribution with parameters $\alpha = 2$ and $\beta$.

\begin{equation}
f(x)= \frac{x e^{(-x/ \beta)}}{\beta^2}, x>0
\end{equation}
(a) Find the maximum likelihood estimator $\hat\beta$ of $\beta$ and show it is unbiased.

(b) Use the moment generating function (MGF) to show that $\hat\beta$ is a consistent estimator of $\beta$.

I know the answer to part (a): $\hat\beta= \frac{1}{2n} \sum_{i=1}^n X_i$, and I can then show that $E[\hat\beta]= \beta$.

I need help with part (b). I know how to show that an estimator is consistent using the definition of convergence in probability or Chebyshev's inequality, but I have no idea how to use the MGF to show that an estimator is consistent.

Best Answer

Show that the MGF of $\hat{\beta},$ $\mathbb{E}(\exp(\hat{\beta}t)),$ converges to $\exp(\beta t)$ as $n \to \infty$ for each fixed $t.$ Note that $\exp(\beta t)$ is the MGF of the degenerate random variable that equals $\beta$ with probability 1.
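
A sketch of that computation, assuming the scale parameterisation in the question's density (so each $X_i$ has MGF $(1-\beta t)^{-2}$ for $t < 1/\beta$) and writing $M_{\hat\beta}$ for the MGF of $\hat\beta = \frac{1}{2n}\sum_{i=1}^n X_i$ (notation introduced here). By independence,

$$M_{\hat\beta}(t) = \mathbb{E}\left[\exp\left(\frac{t}{2n}\sum_{i=1}^n X_i\right)\right] = \left[M_{X_1}\left(\frac{t}{2n}\right)\right]^n = \left(1 - \frac{\beta t}{2n}\right)^{-2n}, \quad t < \frac{2n}{\beta}.$$

For any fixed $t$ the condition $t < 2n/\beta$ holds once $n$ is large enough, and with $m = 2n$ the standard limit $(1 - a/m)^{-m} \to e^{a}$ gives

$$\lim_{n \to \infty} M_{\hat\beta}(t) = e^{\beta t}.$$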

Convergence of the MGFs on an open interval around $t = 0$ implies convergence in distribution.

Convergence in distribution to a constant implies convergence in probability to the same constant.
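
The argument can also be checked numerically. The R sketch below is only an illustration: the values of `beta`, `t`, `eps` and the sample sizes are arbitrary choices, not part of the question. It uses the closed form $(1 - \beta t/(2n))^{-2n}$ for the MGF of $\hat\beta$, which follows from the gamma MGF, and the fact that a sum of $n$ independent gamma variables with shape 2 and scale $\beta$ is gamma with shape $2n$ and scale $\beta.$

```r
# Illustrative values only (assumptions, not from the question).
set.seed(1)
beta <- 2; t <- 0.3; eps <- 0.1
n <- c(10, 100, 1000, 10000)

# Closed-form MGF of beta.hat = sum(X_i)/(2n); valid for t < 2n/beta.
mgf.hat <- (1 - beta * t / (2 * n))^(-2 * n)
mgf.lim <- exp(beta * t)           # MGF of the constant beta
abs(mgf.hat - mgf.lim)             # gap shrinks as n grows

# Monte Carlo estimate of P(|beta.hat - beta| > eps) for each n.
# sum(X_1..X_k) ~ Gamma(shape = 2k, scale = beta), so draw it directly.
p.far <- sapply(n, function(k) {
  beta.hat <- rgamma(10^5, shape = 2 * k, scale = beta) / (2 * k)
  mean(abs(beta.hat - beta) > eps)
})
p.far                              # decreases toward 0
```

The shrinking MGF gap mirrors the limit used in the answer, and the shrinking tail probability is what convergence in probability predicts.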

Related Solutions

Statistics – MLE of Beta Distribution Explained

Additional comments: Your answer seems OK. It may be of interest to know that $\hat \theta$ is not unbiased. One can get a rough idea of the distribution of $\hat \theta$ for a particular $\theta$ by simulating many samples of size $n.$ I don't know of a convenient 'unbiasing' constant multiple. The Wikipedia article I linked in my Comment above gives more information.

Here is a simulation for $n = 10$ and $\theta = 5.$

th = 5;  n = 10
th.mle = -n/replicate(10^6, sum(log(rbeta(n, th, 1))))
mean(th.mle)
## 5.555069   # aprx expectation of th.mle > th = 5.
median(th.mle)
## 5.172145
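
As a cross-check on the simulated mean, the expectation can be worked out in closed form under the $\mathsf{Beta}(\theta, 1)$ model used in the code. This is a sketch, with $T$ a symbol introduced here for convenience. If $X \sim \mathsf{Beta}(\theta, 1),$ then $-\log X$ is exponential with rate $\theta,$ so $T = -\sum_{i=1}^n \log X_i$ has a gamma distribution with shape $n$ and rate $\theta,$ and $E[1/T] = \theta/(n-1)$ for $n > 1.$ Hence

$$E[\hat\theta] = E\left[\frac{n}{T}\right] = \frac{n\theta}{n-1}, \quad n > 1.$$

For $n = 10$ and $\theta = 5$ this gives $50/9 \approx 5.556,$ in line with the simulated value $5.555069.$ It also suggests that $\frac{n-1}{n}\hat\theta$ would be unbiased for $\theta.$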

The histogram below shows the simulated distribution of $\hat \theta.$ The vertical red line is at the mean of that distribution, and the green curve is its kernel density estimator (KDE). According to the KDE, its mode is near $4.62.$

den.inf = density(th.mle)
den.inf$x[den.inf$y==max(den.inf$y)]
## 4.624876

hist(th.mle, br=50, prob=T, col="skyblue2", main="")
abline(v = mean(th.mle), col="red")
lines(density(th.mle), lwd=2, col="darkgreen")

Addendum on Parametric Bootstrap Confidence Interval for $\theta:$

In order to find a confidence interval (CI) for $\theta$ based on MLE $\hat \theta,$ we would like to know the distribution of $V = \frac{\hat \theta}{\theta}.$ When that distribution is not readily available, we can use a parametric bootstrap.

If we knew the distribution of $V,$ then we could find numbers $L$ and $U$ such that $P(L \le V = \hat\theta/\theta \le U) = 0.95,$ so that a 95% CI would be of the form $\left(\frac{\hat \theta}{U},\, \frac{\hat\theta}{L}\right).$ Because we do not know the distribution of $V,$ we use a bootstrap procedure to get serviceable approximations $L^*$ and $U^*$ of $L$ and $U,$ respectively.

To begin, suppose we have a random sample of size $n = 50$ from $\mathsf{Beta}(\theta, 1)$ where $\theta$ is unknown and its observed MLE is $\hat \theta = 6.511.$

Entering the so-called 'bootstrap world', we take repeated 're-samples' of size $n=50$ from $\mathsf{Beta}(\hat \theta =6.511, 1).$ Then we find the bootstrap estimate $\hat \theta^*$ from each re-sample. Temporarily using the observed MLE $\hat \theta = 6.511$ as a proxy for the unknown $\theta,$ we find a large number $B$ of re-sampled values $V^* = \hat\theta^*/\hat \theta.$ Then we use quantiles .02 and .97 of these $V^*$'s as $L^*$ and $U^*,$ respectively.

Returning to the 'real world', the observed MLE $\hat \theta$ returns to its original role as an estimator, and the 95% parametric bootstrap CI is $\left(\frac{\hat\theta}{U^*},\, \frac{\hat\theta}{L^*}\right).$

The R code, in which re-sampled quantities are denoted by .re instead of $*$, is shown below. For this run with set.seed(213) the 95% CI is $(4.94, 8.69).$ Other runs with unspecified seeds using $B=10,000$ re-samples of size $n = 50$ will give very similar values. [In a real-life application, we would not know whether this CI covers the 'true' value of $\theta.$ However, I generated the original 50 observations using parameter value $\theta = 6.5,$ so in this demonstration we do know that the CI covers the true parameter value $\theta.$ We could have used the probability-symmetric CI with quantiles .025 and .975, but the one shown is a little shorter.]

set.seed(213)
B = 10000;  n = 50;  th.mle.obs=6.511
v.re = th.mle.obs/replicate(B, -n/sum(log(rbeta(n,th.mle.obs,1))))
L.re = quantile(v.re, .02);  U.re = quantile(v.re, .97)
c(th.mle.obs/U.re, th.mle.obs/L.re)
##      97%       2% 
## 4.936096 8.691692

[Math] Find the maximum likelihood estimator for Pareto distribution and a unbiased estimator

You've got some notation errors and the work is a bit sloppy, but it is essentially the correct idea. You should have written $$f(x; \alpha, \theta) = \alpha \theta^\alpha x^{-(\alpha+1)}, \quad x \ge \color{red}{\theta},$$ and $$\ell(\theta) = \log \mathcal L(\theta) = n \log \alpha + \alpha n \log \theta - (\alpha + 1) \sum_{i=1}^n \log x_i.$$ In fact, I would have dispensed with this altogether and noted that when $\alpha$ is known, the likelihood is proportional to $$\mathcal L(\theta) \propto \theta^{n\alpha} \mathbb 1(x_{(1)} \ge \theta),$$ hence for $\alpha > 0$, $\mathcal L$ is monotone increasing on the interval $\theta \in (0, x_{(1)}]$ and the MLE is $\hat\theta = x_{(1)}$. No need to take log-likelihoods.

$\hat \theta = x_{(1)}$ is necessarily biased because $\Pr[X_{(1)} > \theta] > 0$ but $\Pr[X_{(1)} < \theta] = 0$. That is to say, the sample minimum can never be less than $\theta$, whereas being greater than it is certainly possible; so taking the expected value of the sample minimum, you can never hope to be equal to $\theta$ on average.

Formally, though, you would need to compute $\operatorname{E}[X_{(1)}]$ by first computing the probability density of the first order statistic. This in turn can be found by considering $$\Pr[X_{(1)} > x] = \Pr[(X_1 > x) \cap (X_2 > x) \cap \ldots \cap (X_n > x)] = ?$$
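
One way to fill in that last step, as a sketch assuming the $X_i$ are independent with the Pareto density above: each $X_i$ has survival function $\Pr[X_i > x] = (\theta/x)^{\alpha}$ for $x \ge \theta,$ so

$$\Pr[X_{(1)} > x] = \prod_{i=1}^n \Pr[X_i > x] = \left(\frac{\theta}{x}\right)^{n\alpha}, \quad x \ge \theta.$$

That is the survival function of a Pareto distribution with shape $n\alpha$ and the same $\theta,$ so for $n\alpha > 1,$

$$\operatorname{E}[X_{(1)}] = \frac{n\alpha\,\theta}{n\alpha - 1} > \theta, \qquad \operatorname{E}\left[\frac{n\alpha - 1}{n\alpha}\, X_{(1)}\right] = \theta.$$

Under these assumptions, rescaling the sample minimum by $(n\alpha - 1)/(n\alpha)$ gives an unbiased estimator of $\theta$ when $\alpha$ is known.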
