Statistics – Justifying Normal Approximation to Binomial Distribution Through MGFs

calculus, moment-generating-functions, statistical-inference, statistics

Would absolutely love if someone could help me with this question, in a step by step way to help those who are uninitiated to Statistics and Mathematics.

So, I am trying to "prove/justify" through MGFs how, as $n$ (the sample size) increases and goes to infinity, a standardized binomial distribution converges to the Standard Normal Distribution. So in the beginning we have $X_n \sim \mathrm{Bin}(n,p)$, then we standardize this $R.V.$ by subtracting the mean $E(X_n)=np$ and dividing by the $SD$, which is $\sqrt{np(1-p)}$. After doing this, we get a new $R.V.$ with $mean=0$ and $variance=1$. It is the distribution of this standardized quantity that converges to a fixed distribution, correct? Namely the Standard Normal Distribution. Let's call the standardized version of $X_n$, $Z_n$.

I think what we need to do is to find the MGF of $Z_n$ and show that this MGF converges to the Standard Normal Distribution's MGF as the sample size $n$ heads toward infinity. Now $Z_n$ is a function of $X_n$, whose MGF we know: $(1-p+pe^t)^n$.
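As a quick numerical sanity check (the values $n=20$, $p=0.3$ are arbitrary choices, not from the problem), the closed-form MGF $(1-p+pe^t)^n$ can be compared against the defining expectation $E[e^{tX_n}]$ computed directly from the binomial pmf:

```python
import math

# Compare the closed-form binomial MGF with the direct expectation
# E[e^{tX}] = sum_k C(n, k) p^k (1-p)^(n-k) e^{tk}.

def mgf_closed_form(t, n, p):
    return (1 - p + p * math.exp(t)) ** n

def mgf_direct(t, n, p):
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k) * math.exp(t * k)
        for k in range(n + 1)
    )

n, p = 20, 0.3  # arbitrary illustrative values
for t in (-0.5, 0.1, 1.0):
    assert math.isclose(mgf_closed_form(t, n, p), mgf_direct(t, n, p), rel_tol=1e-9)
```

The two agree exactly (up to floating point) because the closed form is just the binomial theorem applied to $\sum_k \binom{n}{k}(pe^t)^k(1-p)^{n-k}$.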

Now I know that $Z_n$ is a linear transformation of $X_n$ and that we should be able to use this to simplify the MGF, but I don't understand how the course notes do it, mostly because they skip a lot of steps and it is difficult to follow. There is also a Taylor series expansion of the exponential that I am confused about, although I do understand the concept of the series expansion.

Help would be greatly appreciated! I would love assistance with exactly how we can prove this via MGFs.

Thanks so much in advance 🙂

Best Answer

The MGF for $X_n$ is, as you already have, $(1-p+pe^t)^n$.

Note that for small $t$, we have

$$1-p+pe^t=1+pt+\frac{pt^2}{2}+O(t^3)$$

Thus, we can handle the $n$th power by introducing a logarithm: $$\left(1+pt+\frac{pt^2}{2}+O(t^3)\right)^n=\exp\left(n\log\left(1+pt+\frac{pt^2}{2}+O(t^3)\right)\right)$$ and, using $\log(1+x)=x-\frac{x^2}{2}+O(x^3)$, $$=\exp\left(n\left(pt+\frac{pt^2}{2}-\frac{p^2t^2}{2}+O(t^3)\right)\right)$$
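You can convince yourself of this expansion numerically (using an arbitrary illustrative value $p=0.3$): after subtracting the linear and quadratic terms $pt+\frac{(p-p^2)t^2}{2}$ from $\log(1-p+pe^t)$, the remainder should be of cubic order, i.e. shrink by roughly a factor of $8$ when $t$ is halved.

```python
import math

# Check that log(1 - p + p*e^t) = p*t + (p - p^2)*t^2/2 + O(t^3):
# the remainder after removing the quadratic part should scale like t^3.

def remainder(t, p):
    quadratic = p * t + (p - p * p) * t ** 2 / 2
    return math.log(1 - p + p * math.exp(t)) - quadratic

p = 0.3  # arbitrary illustrative value
r1 = remainder(0.10, p)
r2 = remainder(0.05, p)
assert abs(r1) < 1e-4          # remainder is already tiny at t = 0.1
assert abs(r2) < abs(r1) / 4   # halving t shrinks it ~8x, i.e. cubic order
```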

Then it follows that $$\mathbb{E}(\exp(t(X_n-np)))=\exp(-npt)\exp\left(n(pt+\frac{pt^2}{2}-\frac{p^2t^2}{2}+O(t^3))\right)$$ $$=\exp\left(n\left(\frac{p-p^2}{2}t^2\right)+O(t^3n)\right)$$

Use the definition $Z_n=\dfrac{X_n-np}{\sqrt{np(1-p)}}$, i.e. replace $t$ above with $t/\sqrt{np(1-p)}$; then the MGF for $Z_n$ becomes $$\exp\left(\frac{1}{2}n(p-p^2)\frac{t^2}{np(1-p)}+ O\!\left(\frac{t^3}{\sqrt{n}^3}\,n\right)\right)=\exp\left(\frac{t^2}{2}+O\!\left(\frac{t^3}{\sqrt{n}}\right)\right)$$

The error term is just $O(t^3/\sqrt{n})$, and this converges to $0$ as $n\to\infty$.

Hence the MGF for $Z_n$ converges to $\exp(t^2/2)$ which is precisely the MGF for the standard normal distribution.
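To see the convergence concretely, one can evaluate the exact MGF of $Z_n$, namely $e^{-npt/\sigma}(1-p+pe^{t/\sigma})^n$ with $\sigma=\sqrt{np(1-p)}$, for growing $n$ and compare it with $e^{t^2/2}$ (the values $p=0.3$, $t=1$ are arbitrary illustrative choices):

```python
import math

# The MGF of Z_n = (X_n - n*p)/sqrt(n*p*(1-p)) should approach exp(t^2/2).
# log1p/expm1 keep the computation accurate when t/sigma is tiny.

def mgf_Zn(t, n, p):
    sigma = math.sqrt(n * p * (1 - p))
    # log M_{Z_n}(t) = -n*p*t/sigma + n*log(1 - p + p*e^{t/sigma})
    log_mgf = -n * p * t / sigma + n * math.log1p(p * math.expm1(t / sigma))
    return math.exp(log_mgf)

p, t = 0.3, 1.0  # arbitrary illustrative values
target = math.exp(t * t / 2)
err_small_n = abs(mgf_Zn(t, 10**2, p) - target)
err_large_n = abs(mgf_Zn(t, 10**6, p) - target)
assert err_large_n < 1e-2          # close to e^{1/2} for large n
assert err_large_n < err_small_n   # and the error shrinks as n grows
```

The error decays like $1/\sqrt{n}$, matching the $O(t^3/\sqrt{n})$ remainder in the derivation above.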