$X_i \sim \text{Gamma}(\alpha, \beta)$ for $i=1,\dots, n$ where $\alpha, \beta >0$. Finding the pdf for the random variable $\frac{1}{n}\sum_i X_i$.

Tags: gamma-distribution, moment-generating-functions, probability-distributions, statistical-inference

$\newcommand{\G}[1]{\text{Gamma}(#1)}
\newcommand{\a}{\alpha}
\newcommand{\b}{\beta}
\newcommand{\rd}[1]{\mathrm{d}#1}$

${\bullet \textbf{ Basics for the question:}}$

Let $X_i$ be independent random variables with $X_i \sim \text{Gamma}(\alpha, \beta_i)$, where $\alpha, \beta_i>0$ for $i =1, \dots, n$. Using moment generating functions (by computing the mgf of $\sum_i X_i$), I am able to show that $\sum_i X_i \sim \text{Gamma}(\alpha, \sum_i\beta_i)$.

This means that if the $X_i$ are i.i.d. with $X_i \sim \text{Gamma}(\alpha, \beta)$ for every $i=1,\dots, n$ and $\alpha, \beta >0$, we can conclude that $\sum_i X_i \sim \text{Gamma}(\alpha, n \beta)$.

We already know the pdf of each random variable $X_i$, given by
$$ f(x|\alpha,\beta) = \frac{\a^\b}{\Gamma(\beta)} {\color{red}e}^{-\a x} x^{\b-1} \qquad(x \in (0, \infty)). $$
We also know the pdf of $\sum_i X_i$ since we know that $\sum_i X_i \sim \G{\a, n \b}$. The gamma function is defined as:
$$ \Gamma(\beta) = \int_0^\infty e^{-t}t^{\b-1}\rd t \qquad(\b>0).$$

$\bullet\textbf{ Question:}$

I need to show that $$ \frac{\sum_i X_i}{n} \sim \G{n\alpha, n \beta} $$

My question is: what is the pdf of the random variable $\frac{1}{n}\sum_i X_i$?

$\bullet\textbf{ General Version of the question:}$

Say $X \sim \G{\a, \b}$. What is the pdf of the random variable $\frac{X}{n}$ for any constant $n$?

Best Answer

There is a formula for monotone transformations of a random variable: if $X$ has PDF $f_X(x)$, then the PDF of $Y = g(X)$ where $g$ is monotone and has a differentiable inverse, is

$$f_Y(y) = f_X(g^{-1}(y)) \left|\frac{dg^{-1}}{dy}\right|. \tag{1}$$

In your case, $Y = X/n$, hence $g(x) = x/n$ and $g^{-1}(y) = ny$. Thus $(1)$ becomes $$f_Y(y) = f_X(ny) \left|\frac{d}{dy}[ny]\right| = n f_X(ny). \tag{2}$$

We could also have derived this by considering the CDF (for $n > 0$):

$$F_Y(y) = \Pr[Y \le y] = \Pr[X/n \le y] = \Pr[X \le ny] = F_X(ny),$$ hence

$$f_Y(y) = \frac{d}{dy}[F_X(ny)] = n f_X(ny).$$
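The CDF argument can be checked numerically. The following is a minimal sketch, assuming the special case of shape $1$ (an exponential distribution), chosen only because its CDF has a closed form; the function names are hypothetical and not part of the question. It differentiates $F_X(ny)$ by a central difference and compares the result with $n f_X(ny)$.

```python
import math


def exp_cdf(x, rate):
    """CDF of an exponential distribution (Gamma with shape 1, given rate)."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def exp_pdf(x, rate):
    """PDF of the same exponential distribution."""
    return rate * math.exp(-rate * x) if x > 0 else 0.0


def scaled_pdf_from_cdf(y, n, rate, h=1e-6):
    """Approximate d/dy F_X(n*y) with a central difference of step h."""
    return (exp_cdf(n * (y + h), rate) - exp_cdf(n * (y - h), rate)) / (2 * h)


def scaled_pdf_formula(y, n, rate):
    """Closed form n * f_X(n*y) from the change-of-variables rule."""
    return n * exp_pdf(n * y, rate)


if __name__ == "__main__":
    n, rate = 5, 1.5  # illustrative values, not taken from the question
    for y in (0.1, 0.5, 2.0):
        print(y, scaled_pdf_from_cdf(y, n, rate), scaled_pdf_formula(y, n, rate))
```

The two columns should agree up to the finite-difference error, which illustrates that differentiating $F_X(ny)$ gives $n f_X(ny)$.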

Thus, for $X \sim \operatorname{Gamma}(\alpha,\beta)$ where $\alpha$ is a shape parameter and $\beta$ is a rate parameter, $Y = X/n$ has PDF

$$f_Y(y) = n \frac{\beta^\alpha (ny)^{\alpha - 1} e^{-\beta ny}}{\Gamma(\alpha)} = \frac{(n\beta)^\alpha y^{\alpha - 1} e^{-(n\beta) y}}{\Gamma(\alpha)}, \tag{3}$$ which corresponds to a gamma PDF with shape $\alpha$ and rate $n\beta$.
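Identity $(3)$ can also be verified pointwise. The sketch below assumes the shape/rate parametrization used in this answer and relies only on the standard library; `gamma_pdf` and `pdf_of_scaled` are illustrative names. It evaluates $n f_X(ny)$ and compares it with the gamma density of shape $\alpha$ and rate $n\beta$.

```python
import math


def gamma_pdf(x, shape, rate):
    """Gamma density with the given shape and rate, evaluated at x."""
    if x <= 0:
        return 0.0
    # Work on the log scale to avoid overflow for large shape values.
    log_pdf = (shape * math.log(rate) + (shape - 1) * math.log(x)
               - rate * x - math.lgamma(shape))
    return math.exp(log_pdf)


def pdf_of_scaled(y, n, shape, rate):
    """Density of Y = X / n via the rule f_Y(y) = n * f_X(n * y)."""
    return n * gamma_pdf(n * y, shape, rate)


if __name__ == "__main__":
    n, shape, rate = 4, 2.5, 1.5  # illustrative values only
    for y in (0.05, 0.3, 1.0, 2.0):
        lhs = pdf_of_scaled(y, n, shape, rate)
        rhs = gamma_pdf(y, shape, n * rate)  # Gamma(shape, n * rate)
        print(y, lhs, rhs, math.isclose(lhs, rhs, rel_tol=1e-9))
```

Agreement at a handful of points is not a proof, but it is a quick sanity check on the algebra in $(3)$.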

Note that you appear to be using a nonstandard parametrization of the gamma distribution: typically, the first parameter $\alpha$ is the shape, and the second is the rate, or scale, parameter. Your statement that $$\sum_{i=1}^n X_i \sim \operatorname{Gamma}(\alpha, n\beta)$$ is inconsistent with this parametrization. The sum of $n$ IID gamma variables will be gamma with shape $n\alpha$, but the same rate $\beta$. Thus the correct formula should be $$\sum_{i=1}^n X_i \sim \operatorname{Gamma}(n\alpha, \beta)$$ and in particular, if $$f_X(x) = \frac{\beta^\alpha x^{\alpha-1} e^{-\beta x}}{\Gamma(\alpha)},$$ then the sum $S = \sum_i X_i$ will have density $$f_S(x) = \frac{\beta^{n\alpha} x^{n \alpha-1} e^{-\beta x}}{\Gamma(n \alpha)}.$$
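For completeness, here is a sketch of the moment generating function argument under the shape/rate parametrization, assuming the $X_i$ are independent. For $X \sim \operatorname{Gamma}(\alpha, \beta)$ with shape $\alpha$ and rate $\beta$, the mgf is
$$M_X(t) = \left(1 - \frac{t}{\beta}\right)^{-\alpha}, \qquad t < \beta.$$
By independence, the sum $S = \sum_{i=1}^n X_i$ has mgf
$$M_S(t) = \prod_{i=1}^n M_{X_i}(t) = \left(1 - \frac{t}{\beta}\right)^{-n\alpha}, \qquad t < \beta,$$
which is the mgf of $\operatorname{Gamma}(n\alpha, \beta)$. For the sample mean $\bar X = S/n$,
$$M_{\bar X}(t) = M_S\!\left(\frac{t}{n}\right) = \left(1 - \frac{t}{n\beta}\right)^{-n\alpha}, \qquad t < n\beta,$$
which is the mgf of a gamma distribution with shape $n\alpha$ and rate $n\beta$. This agrees with applying $(3)$ to $S$: dividing by $n$ leaves the shape unchanged and multiplies the rate by $n$.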
