Central Limit Theorem – How the Central Limit Theorem Applies to the Normal Approximation of the Negative Binomial

central-limit-theorem, geometric-distribution, negative-binomial-distribution, self-study

The question is:

Explain why the negative binomial distribution will be approximately normal if the parameter k is large enough. What are the parameters of this normal approximation?

I have previously asked about parts of this question, but I want to confirm that my thinking is correct, as this is the answer I am least comfortable with. I have provided all my working below, but I am unsure whether the variance I have stated for the normal distribution follows correctly from my working, or whether I have missed or dropped an important variable in the calculation.

1.6(c)
From the Central Limit Theorem we know that as the number of i.i.d. samples from any distribution with finite variance increases, their sum becomes better approximated by a normal distribution:
$\sum_{i=1}^n X_i \underset{n\rightarrow\infty}{\rightarrow} \mathcal{N}(n\mu_X,\ \sigma^2_{\Sigma X} = n\sigma^2_X)$
By defining a negative binomial random variable as the sum of $k$ geometric random variables:
$X = Y_1 + Y_2 + \dots + Y_k = \sum_{i=1}^k Y_i$
where $Y_i \sim \mathrm{Geometric}(\pi)$.
Therefore, as $k$ increases, we can restate the Central Limit Theorem as:
$\sum_{i=1}^k Y_i \underset{k\rightarrow\infty}{\rightarrow} \mathcal{N}(k\mu_Y,\ \sigma^2_{\Sigma Y} = k\sigma^2_Y)$
Since the negative binomial random variable $X$ can be represented as the sum of $k$ independent and identically distributed geometric random variables $\sum_{i=1}^k Y_i$, its mean is:
$E[X] = k \times E[Y_i] = k\mu_Y = \dfrac{k}{\pi}$
We also know that the variance of a geometric distribution is given by:
$Var(Y_i) = \dfrac{1-\pi}{\pi^2}$
Because the $Y_i$ are independent, their variances add, giving $Var(X) = k \times Var(Y_i)$. So, as $k$ becomes large, the negative binomial distribution can be approximated by:
$X \sim \mathcal{N}\left(\mu = \dfrac{k}{\pi},\ \sigma^2 = \dfrac{k(1-\pi)}{\pi^2}\right)$
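As a quick sanity check on this derivation, here is a minimal simulation sketch using numpy (the values $k = 50$ and $\pi = 0.3$ are arbitrary illustrative choices, not from the question):

```python
import numpy as np

# Sanity check for the derivation above; k and pi are arbitrary
# illustrative choices, not values from the question.
rng = np.random.default_rng(seed=0)
k, pi, reps = 50, 0.3, 100_000

# numpy's geometric() counts the trials up to and including the first
# success, so each draw matches Y_i with E[Y_i] = 1/pi.
sums = rng.geometric(pi, size=(reps, k)).sum(axis=1)

print("sample mean:", sums.mean(), " theory:", k / pi)
print("sample var: ", sums.var(), " theory:", k * (1 - pi) / pi**2)
```

Both printed pairs should agree up to simulation noise.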

Best Answer

You can also use the CLT directly; one form of the CLT states:

$\dfrac{\sum_{i=1}^n X_i - n\mu}{\sigma\sqrt{n}} \sim N(0,1) \Rightarrow \sum_{i=1}^n X_i \sim N(n\mu,\ n\sigma^2)$

The above equations involve two theorems. The first is one form of the CLT (Lindeberg–Lévy): if $X_1, X_2, \dots$ are i.i.d. with mean $\mu$ and finite variance $\sigma^2$, then

$\dfrac{\sum_{i=1}^n X_i - n\mu}{\sigma\sqrt{n}} \xrightarrow{d} N(0,1)$

The second concerns the multivariate normal distribution, but it also applies to a one-dimensional random vector: if $Z \sim N(\mu, \Sigma)$, then $AZ + b \sim N(A\mu + b,\ A\Sigma A^\top)$ for any conformable matrix $A$ and vector $b$.
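To spell out how the two theorems combine (a step only implied by the images in the original answer): write $S_k = \sum_{i=1}^k Y_i$; the CLT gives a standard normal limit, and the affine-transformation property of the normal converts it into the unstandardized approximation:

$\dfrac{S_k - k\mu}{\sigma\sqrt{k}} \approx Z \sim N(0,1) \;\Longrightarrow\; S_k \approx k\mu + \sigma\sqrt{k}\,Z \sim N(k\mu,\ k\sigma^2)$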

For your case:

$\sum_{i=1}^k Y_i \sim N(\frac{k}{\pi},k\frac{1-\pi}{\pi^2})$
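For instance, with the same illustrative values $k = 50$ and $\pi = 0.3$, this gives approximately $N(166.7,\ 388.9)$. A short sketch (again using numpy and scipy, with the same arbitrary parameters) compares the standardized simulated sums against $N(0,1)$ quantiles:

```python
import numpy as np
from scipy import stats

# Compare the simulated sum of geometrics with the normal approximation
# N(k/pi, k(1-pi)/pi^2); k and pi are the same arbitrary choices as above.
rng = np.random.default_rng(seed=1)
k, pi, reps = 50, 0.3, 100_000
sums = rng.geometric(pi, size=(reps, k)).sum(axis=1)

mu, var = k / pi, k * (1 - pi) / pi**2
z = (sums - mu) / np.sqrt(var)  # standardized sums should be ~ N(0, 1)

# Empirical quantiles of z versus the corresponding N(0, 1) quantiles.
for q in (0.025, 0.5, 0.975):
    print(q, round(float(np.quantile(z, q)), 3), round(stats.norm.ppf(q), 3))
```

The empirical and theoretical quantiles should match closely, which is exactly the normal approximation at work.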
