[Math] Why do we use only 1/2 for the continuity correction when approximating a binomial random variable by a standard normal random variable

normal-distribution, probability, probability-distributions, probability-theory

I have read about the continuity correction used when approximating a binomial random variable by a standard normal variable, but all the examples use only 1/2 as the continuity correction factor. Let's take an example:

Let $X$ be the number of times that a fair coin, flipped 40 times, lands heads. Find the probability that $X = 20$.
$X$ is a binomial random variable with mean 20 and variance 10.
The exact answer is 12.54%.

Use the normal approximation for this question.

\begin{align}
P(X=20) &\approx P(19.5 \leq X \leq 20.5) \\
&= P\!\left( \frac{19.5-20}{\sqrt{10}} \leq \frac{X-20}{\sqrt{10}} \leq \frac{20.5-20}{\sqrt{10}} \right) \\
&\approx P\!\left( -0.16 < \frac{X-20}{\sqrt{10}} < 0.16 \right) \\
&= \Phi(0.16) - \Phi(-0.16) \\
&= 12.72\,\%
\end{align}

where $\Phi$ is the distribution function of a standard normal random variable.

This is pretty close to the exact answer.
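
As a quick numerical check, here is a small Python sketch (standard library only; the helper name `phi` is just an illustrative stand-in for $\Phi$, not anything from the original post) that reproduces both the exact binomial value and the continuity-corrected approximation:

```python
from math import comb, erf, sqrt

def phi(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, p = 40, 0.5
mu, sd = n * p, sqrt(n * p * (1 - p))   # mean 20, standard deviation sqrt(10)

# Exact binomial probability P(X = 20)
exact = comb(n, 20) * p**20 * (1 - p)**(n - 20)

# Continuity-corrected normal approximation P(19.5 <= Y <= 20.5)
approx = phi((20.5 - mu) / sd) - phi((19.5 - mu) / sd)

print(f"exact  = {exact:.4f}")   # 0.1254
print(f"approx = {approx:.4f}")  # 0.1256 (the text rounds z to 0.16, giving 12.72%)
```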

Now suppose I had instead written

\begin{align}
P(X=20) &\approx P(19.99 \leq X \leq 20.01) \\
&= P\!\left( \frac{19.99-20}{\sqrt{10}} < \frac{X-20}{\sqrt{10}} < \frac{20.01-20}{\sqrt{10}} \right) \\
&= P\!\left( -0.031 < \frac{X-20}{\sqrt{10}} < 0.031 \right) \\
&= \Phi(0.031) - \Phi(-0.031) \\
&= 2.4\,\%
\end{align}
Here I have brought the range very close to 20 instead of using $(19.5, 20.5)$.

Why is the approximation in the second case worse than in the first case?

Best Answer

First of all, you did the second calculation incorrectly. You should have had $\Phi(0.0032)-\Phi(-0.0032)$. If you do this, you will see that the result of the second calculation is approximately 50 times smaller than the result of the first. (This comes from continuity of the normal PDF.) This is to be expected because you're using an interval of length $0.02=1/50$ instead of an interval of length $1$. That is, a normal approximation of $50\,(\Phi(0.0032)-\Phi(-0.0032))$ would give you a very similar result to the first calculation.
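
To see this factor of 50 numerically, here is a minimal Python sketch (standard library only; `phi` is an illustrative helper for the standard normal CDF):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

sd = sqrt(10.0)

first  = phi(0.5 / sd)  - phi(-0.5 / sd)    # interval (19.5, 20.5), length 1
second = phi(0.01 / sd) - phi(-0.01 / sd)   # interval (19.99, 20.01), length 0.02

print(first)            # ~0.1256
print(second)           # ~0.0025
print(first / second)   # ~49.8, i.e. roughly 50
print(50 * second)      # ~0.126, close to the first value
```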

The rest of this is a somewhat subtle discussion about this issue, which may not be quite suitable for a beginning statistics student.

The meaningful thing is that the binomial CDF $F_b$ and the appropriate normal CDF $F_n$ are close to one another. The continuity correction is used to relate this to the binomial PMF $f_b$. The problem is that $f_b$ can be represented in any number of different ways in terms of $F_b$.

Specifically, for any two real numbers $x,y \in [0,1)$, $f_b(m)=F_b(m+x)-F_b(m-1+y)$. To approximate $f_b(m)$ using $F_n$, we choose $x$ and $y$ and then approximate these values of $F_b$ using the corresponding values of $F_n$.
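
To make the identity concrete, here is a short Python check (the helper names `binom_pmf` and `binom_cdf` are illustrative, not part of the answer) confirming that $f_b(m) = F_b(m+x) - F_b(m-1+y)$ for several choices of $x, y \in [0,1)$:

```python
from math import comb, floor

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(t, n, p):
    """F_b(t) = P(X <= t) for X ~ Binomial(n, p); a step function in t."""
    k = floor(t)
    if k < 0:
        return 0.0
    return sum(binom_pmf(j, n, p) for j in range(min(k, n) + 1))

n, p, m = 40, 0.5, 20
for x, y in [(0.0, 0.0), (0.5, 0.5), (0.9, 0.1), (0.25, 0.75)]:
    lhs = binom_pmf(m, n, p)
    rhs = binom_cdf(m + x, n, p) - binom_cdf(m - 1 + y, n, p)
    print(x, y, abs(lhs - rhs) < 1e-12)   # True for every choice of x, y in [0, 1)
```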

But it is not good enough to just get an acceptable approximation of these values of $F_b$, because when $n$ (the number of binomial trials) is large, these two numbers are very close. As a result, small relative errors in our approximations of $F_b(m+x)$ and $F_b(m-1+y)$ can cause large relative errors in the difference. (As an example, consider $a=1.001$, $b=1$, and attempting to approximate $a-b$ after replacing $a$ with $c=1.0015$. Here $c$ and $a$ are relatively quite close together, but $a-b$ and $c-b$ are relatively quite far apart.)

It turns out that we can only make this error even moderately small if $x=y$. More precisely, if $x \neq y$ then our value will be off by a factor of about $x-y+1$, even as $n \to \infty$, because we end up integrating the nearly constant normal density over an interval of length $x-y+1$ rather than length $1$. This is essentially because the values of the binomial variable are separated by intervals of length $1$; the precise details involve analytical tricks that are beyond the scope of both calculus and elementary statistics.
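
A rough numerical illustration of this factor (a sketch only: the choice $x=0.9$, $y=0.2$ is arbitrary, and `binom_pmf` is computed on the log scale so large $n$ does not underflow):

```python
from math import erf, exp, lgamma, log, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def binom_pmf(k, n, p):
    """Binomial pmf via log-gamma, stable for large n."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1 - p))
    return exp(log_pmf)

p = 0.5
x, y = 0.9, 0.2                      # x != y; the interval (m-1+y, m+x) has length x - y + 1 = 1.7
for n in (40, 400, 4000, 40000):
    m = n // 2                       # value closest to the mean np
    mu, sd = n * p, sqrt(n * p * (1 - p))
    exact = binom_pmf(m, n, p)
    approx = phi((m + x - mu) / sd) - phi((m - 1 + y - mu) / sd)
    print(n, round(approx / exact, 3))   # approaches x - y + 1 = 1.7, not 1, as n grows
```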

Moreover the error is minimized in an appropriate sense when $x=y=1/2$. Nevertheless, if $X$ is $Bin(n,p)$ and $Y$ is $N(np,np(1-p))$, then $P(X=m)$ is still well-approximated by, for example, $P(m \leq Y \leq m+1)$. It is just that $P(m-1/2 \leq Y \leq m+1/2)$ is a better approximation, especially if $m$ is close to $np$.
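
A small comparison of the two approximations against the exact pmf (again only a sketch; the values of $m$ are arbitrary, and `phi` is an illustrative helper):

```python
from math import comb, erf, sqrt

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, p = 40, 0.5
mu, sd = n * p, sqrt(n * p * (1 - p))

for m in (20, 24, 28):
    exact    = comb(n, m) * p**m * (1 - p)**(n - m)
    shifted  = phi((m + 1 - mu) / sd)   - phi((m - mu) / sd)        # P(m <= Y <= m+1)
    centered = phi((m + 0.5 - mu) / sd) - phi((m - 0.5 - mu) / sd)  # P(m-1/2 <= Y <= m+1/2)
    print(m, exact, shifted, centered)  # the centered interval tracks the exact value more closely
```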