Normal approximation to the binomial distribution

approximation, normal distribution, probability

I am having trouble getting to the bottom of this concept for two types of questions (the homework has already been handed in, but I have a test this week and would like to do better). Hopefully someone can help me get this through my thick skull.

Question 1

In the following problem, check that it is appropriate to use the
normal approximation to the binomial. Then use the normal distribution
to estimate the requested probabilities.

Do you try to pad an insurance claim to cover your deductible? About
39% of all U.S. adults will try to pad their insurance claims! Suppose
that you are the director of an insurance adjustment office. Your
office has just received 134 insurance claims to be processed in the
next few days.

(a) half or more of the claims have been padded

Question 2

Based on long experience, an airline found that about 6% of the people
making reservations on a flight from Miami to Denver do not show up
for the flight. Suppose the airline overbooks this flight by selling
272 ticket reservations for an airplane with only 255 seats.

Use the normal approximation to the binomial distribution to answer:
What is the probability that a seat will be available for every person
who shows up holding a reservation?


Adding per the comments posted. My understanding of the approximation is that given np > 5 and nq > 5 the approximation can be used, and I have in my notes the definitions of

 Mean = np 
 Sigma = Sqrt(npq)
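
As a rough sketch of that check, the snippet below tests the np > 5 and nq > 5 rule of thumb and computes the mean and sigma for both questions. The function name `normal_approx_params` and its `threshold` parameter are made up for illustration, and treating "shows up" as the success in Question 2 (so p = 0.94) is an assumption about how to set that problem up.

```python
import math

def normal_approx_params(n, p, threshold=5):
    """Return (rule_of_thumb_ok, mean, sigma) for a binomial(n, p).

    Hypothetical helper: the rule of thumb is np > threshold and nq > threshold.
    """
    q = 1 - p
    ok = n * p > threshold and n * q > threshold
    mean = n * p
    sigma = math.sqrt(n * p * q)
    return ok, mean, sigma

# Question 1: 134 claims, 39% padded.
ok1, mean1, sigma1 = normal_approx_params(134, 0.39)

# Question 2: 272 reservations, 94% show up (assumed choice of "success").
ok2, mean2, sigma2 = normal_approx_params(272, 0.94)

print(ok1, round(mean1, 2), round(sigma1, 2))
print(ok2, round(mean2, 2), round(sigma2, 2))
```

Both cases pass the rule of thumb, so the normal approximation looks reasonable for each.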

Nowhere does it discuss how to figure out a probability, so I assume I use the probability from the binomial distribution (is this correct?):

 P = nCk * p^k * q^(n-k)

My other issue is that the binomial formula gives the probability of exactly k successes out of n attempts. That doesn't seem to be what Question 1 asks for (though I suppose I could sum the probabilities for k, k - 1, ..., 0 successes). But again, I am a touch confused here as well.
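
For what it's worth, summing the formula over a range of k is easy to do by machine, which gives an exact value to compare any approximation against. A minimal sketch using only Python's standard library; the helper names `binom_pmf` and `binom_tail_at_least` are hypothetical, and 67 is half of the 134 claims in Question 1.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ binomial(n, p): nCk * p^k * q^(n-k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_tail_at_least(k_min, n, p):
    """P(X >= k_min), obtained by summing the pmf from k_min up to n."""
    return sum(binom_pmf(k, n, p) for k in range(k_min, n + 1))

# Question 1: probability that half or more (67+) of 134 claims are padded.
exact_q1 = binom_tail_at_least(67, 134, 0.39)
print(f"exact P(X >= 67) = {exact_q1:.5f}")
```

Doing this by hand for 68 terms is impractical, which is a large part of why the normal approximation is taught.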

Looking at question 2, it is more of the same: what is the probability that no more than 255 people will show up? Maybe my question is more about the application of the binomial distribution. The textbook (Brase and Brase) is not super strong, and I have been trying to augment it with web resources and other texts.

I am taking this as a leading class in a graduate degree in Econ. I also have a team of computational finance guys who work indirectly for me, and I would like to be able to have better conversations with them; I have also always wanted to know more about probability. At the end of the day, I got a 94 on this homework assignment (so it's not like this is killing my grades), but these two questions (with a few more parts to them) really stumped me. It seems like a trivial place to get stumped, and I would hate for this to come back and bite me later on!


I think if I approach it this way:

p = 0.39
q = 1 - p = 0.61
n = 134
Mean:  52.26 (np)
Sigma: 5.65 (sqrt(npq))

test = 134 / 2 = 67

P(x >= 67)
~= P(x > 66.5)                    the continuity correction
 = P(z > (66.5 - mean) / sigma)
 = P(z > (66.5 - 52.26) / 5.65)
 = 1 - P(z <= 2.52)

thoughts?

Best Answer

I think you missed a critical section, and have some reading to do. This is one of the most applicable techniques from the course, so you should make sure you get this. If your textbook isn't good, read other sources.

When you approximate the distribution of $X$ by a normal distribution $Y$, presumably one with the same mean and standard deviation, this means you get to approximate $P(X \gt a)$ by $P(Y \gt a)$, and similarly $P(a \lt X \lt b) \approx P(a \lt Y \lt b)$.

In these problems, you aren't supposed to be calculating the probability that a binomial random variable $X$ is greater than some value by summing the probability mass function for $X$. You are supposed to find a normal distribution $Y$ which is a good approximation (match the mean and standard deviation), and then look up or estimate $P(Y \gt a)$.
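
As a sketch of what that looks like for Question 1, the snippet below builds a normal distribution with the binomial's mean and standard deviation and reads the tail probability from its CDF instead of a table. It uses `statistics.NormalDist` from the Python standard library; the variable names are mine, and the cutoff 66.5 (rather than 67) is the half-unit continuity correction mentioned in the question.

```python
from math import sqrt
from statistics import NormalDist

n, p = 134, 0.39          # Question 1: 134 claims, 39% padded
mean = n * p              # 52.26
sigma = sqrt(n * p * (1 - p))

# Y is the matching normal distribution (same mean and standard deviation).
Y = NormalDist(mu=mean, sigma=sigma)

# "Half or more padded" is X >= 67, approximated by P(Y > 66.5).
prob_half_or_more = 1 - Y.cdf(66.5)
print(f"P(half or more padded) ~= {prob_half_or_more:.4f}")
```

The result is well under 1%, which matches the intuition that 67 is about two and a half standard deviations above the mean.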

Another way to state this is that you should translate the condition into something like "at most $b$ standard deviations above the mean" and then estimate this as the probability that a standard normal random variable is in that range.
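
Here is a minimal sketch of that translation for Question 2, assuming $X$ counts the passengers who show up, so $X$ is binomial with $n = 272$ and $p = 0.94$. Everyone gets a seat when $X \le 255$; the cutoff 255.5 applies the usual half-unit continuity correction. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

n, p = 272, 0.94                  # reservations sold, assumed show-up rate
mean = n * p                      # expected number of passengers who show up
sigma = sqrt(n * p * (1 - p))

# Convert the cutoff into "standard deviations from the mean".
cutoff = 255.5                    # X <= 255, with continuity correction
z = (cutoff - mean) / sigma

# Probability that a standard normal variable is at most z.
prob_everyone_seated = NormalDist().cdf(z)
print(f"z = {z:.3f}, P(seat for everyone) ~= {prob_everyone_seated:.3f}")
```

The cutoff sits just below the mean, so the probability comes out a little under one half.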

You probably also covered a continuity correction. When you have a discrete random variable which takes integer values, you may want to think of each integer $k$ as the interval $(k-1/2, k+1/2).$ So, $P(X \ge 10)$ really means $P(X \ge 9.5) \approx P(Y \ge 9.5)$. You don't need to do this if $P(9.5 \lt Y \lt 10)$ is negligible, but in many small examples this correction makes the normal approximation much more accurate. There are even more corrections such as the skew-normal approximation, but those are less important and might not have been included in your class.
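
To see the effect of the correction on a small example, the sketch below compares the exact value of $P(X \ge 10)$ with the normal approximation taken at 10 and at 9.5. The binomial parameters $n = 20$, $p = 0.4$ are hypothetical, chosen only to make the example small; they are not from the original problems.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 20, 0.4     # hypothetical small example, not from the homework
a = 10             # we want P(X >= 10)

# Exact answer: sum the binomial pmf from a up to n.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(a, n + 1))

# Matching normal distribution.
Y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

uncorrected = 1 - Y.cdf(a)        # P(Y >= 10)
corrected = 1 - Y.cdf(a - 0.5)    # P(Y >= 9.5), continuity correction

print(f"exact       = {exact:.4f}")
print(f"uncorrected = {uncorrected:.4f}")
print(f"corrected   = {corrected:.4f}")
```

In this example the corrected value lands much closer to the exact sum than the uncorrected one does.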