[Math] UMVUE of a parameter for Pareto Distribution

probability-distributions, statistical-inference, statistics

Problem: Let $ (X_1, X_2, \ldots, X_n) $ be a random sample from a Pareto distribution with pdf $ f(x; \theta) = \frac{\theta}{c}\left(\frac{c}{x}\right)^{\theta + 1}, ~~ x > c $. Find the UMVUE of the parameter $ \theta $.

My work so far: I showed that the UMVUE cannot be found through the Cramér–Rao method, since the bound is not attained. I then tried the classic Rao–Blackwell approach: I proved that the given Pareto distribution belongs to the exponential family and that $T = \sum_{i=1}^{n} \log X_i $ is a complete and sufficient statistic for $ \theta $. Now I am stuck, because I cannot find an unbiased estimator of $ \theta $ to proceed with.

Best Answer

We have the joint pdf $$ f(\vec x ; \theta) = \theta^n c^{\theta n} \prod_{i=1}^n x_i^{-(\theta+1)}\mathbb{1}_{x_i \ge c} =\mathbb{1}_{x_{(1)} \ge c} \left[ \theta^n c^{\theta n} \right] \exp \left[ -(\theta+1) \sum_{i=1}^n \ln x_i\right] $$ and so, since this is a full-rank one-parameter exponential family, $\sum_{i=1}^n \ln X_i$ is a complete and sufficient statistic for $\theta$.
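For concreteness, this matches the one-parameter exponential family form $f(\vec x;\theta) = h(\vec x)\, g(\theta) \exp\left[ w(\theta) T(\vec x) \right]$ with $$ h(\vec x) = \mathbb{1}_{x_{(1)} \ge c}, \quad g(\theta) = \theta^n c^{\theta n}, \quad w(\theta) = -(\theta+1), \quad T(\vec x) = \sum_{i=1}^n \ln x_i, $$ and since $w(\theta) = -(\theta+1)$ sweeps out an open interval as $\theta$ ranges over its parameter space, the family is full rank, which gives completeness of $T$.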

For a preliminary result, consider $Y = \ln(X) - \ln(c)$, where $X$ follows the given Pareto distribution, i.e. $f_X(x) = \theta c^\theta x^{-(\theta+1)} \mathbb{1}_{x \ge c}$. Then, since $X = ce^Y$, we get $$ f_Y(y) = f_X(ce^y) \left\vert \frac{dx}{dy} \right\vert = \theta c^\theta (c e^y)^{-(\theta+1)} \mathbb{1}_{ ce^y \ge c} \cdot ce^y = \theta e^{-y \theta} \mathbb{1}_{ y \ge 0}, $$ which is the pdf of an exponential distribution with rate $\theta$. Define $Y_i := \ln(X_i) - \ln(c)$.
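This transformation is easy to sanity-check numerically. Below is a minimal simulation sketch (the values $c = 2$, $\theta = 3$ and the sample size are arbitrary choices); since $F(x) = 1 - (c/x)^\theta$ for $x \ge c$, Pareto draws can be generated by inverse CDF as $X = c\,U^{-1/\theta}$ with $U \sim \mathrm{Uniform}(0,1)$:

```python
import numpy as np

rng = np.random.default_rng(0)
c, theta, n_sim = 2.0, 3.0, 10**6

# Inverse-CDF sampling: F(x) = 1 - (c/x)^theta for x >= c,
# so X = c * U**(-1/theta) with U ~ Uniform(0, 1).
u = rng.uniform(size=n_sim)
x = c * u ** (-1.0 / theta)

# Y = ln(X) - ln(c) should be exponential with rate theta.
y = np.log(x) - np.log(c)
print(y.mean(), 1 / theta)     # empirical vs. theoretical mean, 1/theta
print(y.var(), 1 / theta**2)   # empirical vs. theoretical variance, 1/theta^2
```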

It follows that $\sum_{i=1}^n Y_i = \sum_{i=1}^n (\ln X_i - \ln c)$ follows a $\Gamma(n,\theta)$ distribution, since it is the sum of $n$ independent rate-$\theta$ exponential random variables. Note that the mean of a rate-$\theta$ exponential r.v. is $1/\theta$ and the mean of a $\Gamma(n,\theta)$ r.v. is $n/\theta$.
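Here $\Gamma(n,\theta)$ is parameterized by shape $n$ and rate $\theta$, with density $$ f_Z(z) = \frac{\theta^n}{\Gamma(n)} z^{n-1} e^{-\theta z}, \qquad z > 0, $$ which is exactly the pdf used in the expectation computation below.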

So $\frac{1}{n} \sum_{i=1}^n Y_i$ is an unbiased estimator of $1/\theta$, and it is natural to guess that $1/ \left( \frac{1}{n} \sum_{i=1}^n Y_i \right)$ is an unbiased estimator of $\theta$. By Jensen's inequality, however, the reciprocal of an unbiased estimator is generally biased, so we compute its expectation explicitly.

Let $Z \sim \Gamma(n,\theta)$. Then, for $n \ge 2$, the expectation of $1/ \left( \frac{1}{n} \sum_{i=1}^n Y_i \right)$ equals: \begin{align*} E \left[ \frac{n}{Z} \right] &= n \int_0^\infty \frac{1}{z} \frac{1}{\Gamma(n)} \theta^n z^{n-1} e^{- \theta z} \; dz \\ &= n \int_0^\infty \frac{1}{\Gamma(n)} \theta^n z^{n-2} e^{- \theta z} \; dz \\ &= n \frac{ \theta \Gamma(n-1)}{\Gamma(n)} \int_0^\infty \frac{1}{\Gamma(n-1)} \theta^{n-1} z^{n-2} e^{- \theta z} \; dz \end{align*} and this equals $\theta n \dfrac{ (n-2)!}{(n-1)!}= \frac{n}{n-1} \theta$, since the rightmost integral integrates the pdf of a $\Gamma(n-1,\theta)$ random variable over its support.
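One can corroborate this numerically. A minimal sketch (the values $n = 5$, $\theta = 2$ are arbitrary; note that NumPy's gamma sampler takes a shape and a scale, so scale $= 1/\theta$ in the rate parameterization used here):

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta, n_sim = 5, 2.0, 10**6

# Z ~ Gamma(shape=n, rate=theta); NumPy's sampler uses scale = 1/rate.
z = rng.gamma(shape=n, scale=1 / theta, size=n_sim)

print(np.mean(n / z))        # empirical E[n/Z]
print(n / (n - 1) * theta)   # theoretical value n*theta/(n-1) = 2.5
```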

It follows from the Lehmann–Scheffé theorem that $\dfrac{n-1}{n} \cdot \dfrac{1}{\frac{1}{n} \sum_{i=1}^n Y_i} = \dfrac{n-1}{\sum_{i=1}^n (\ln X_i - \ln c) }$ is the UMVUE of $\theta$.
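As a final check, here is a simulation sketch (assumed values $n = 5$, $c = 2$, $\theta = 3$; Pareto samples again drawn by inverse CDF) comparing the bias-corrected estimator with the naive reciprocal:

```python
import numpy as np

rng = np.random.default_rng(0)
n, c, theta, n_rep = 5, 2.0, 3.0, 10**5

# Each row is one sample of size n from the Pareto(c, theta), via inverse CDF.
u = rng.uniform(size=(n_rep, n))
x = c * u ** (-1.0 / theta)

s = np.log(x).sum(axis=1) - n * np.log(c)  # sum_i (ln X_i - ln c), one per sample
print(np.mean((n - 1) / s))  # proposed UMVUE: should be close to theta = 3
print(np.mean(n / s))        # uncorrected reciprocal: biased up by n/(n-1)
```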