I've been trying to learn which distributions to use in GLMs, and I'm a little confused about when to use the normal distribution. In one part of my textbook, it says that a normal distribution could be good for modeling exam scores. In the next part, it asks what distribution would be appropriate to model a car insurance claim. This time, it says that the appropriate distributions would be Gamma or Inverse Gaussian because they're continuous with only positive values. Well, I believe that exam scores would also be continuous with only positive values, so why would we use a normal distribution there? Doesn't the normal distribution allow for negative values?
Solved – Do test scores really follow a normal distribution
Tags: gamma distribution, generalized linear model, inverse-gaussian-distribution, normal distribution
Related Solutions
As suggested by probabilityislogic, my comment is converted to an answer.
You need to integrate by parts repeatedly beginning with $u=y^{\alpha-1}$, $v=-e^{-y}$, $\mathrm dv=e^{-y}\mathrm dy$, and using $$\int_0^x u\ \mathrm dv= uv\biggr|_0^x - \int_0^x v\ \mathrm du.$$ Since $\mathrm du = (\alpha-1)y^{\alpha-2}\mathrm dy$, each time you do an integration by parts, you will get an integral with a smaller exponent for $y$ on the right hand side. If $\alpha$ is an integer (as it is in this particular case), you will be able to finish up the process with a $\int_0^x e^{-y}\mathrm dy$. If $\alpha$ is not an integer, things are more complicated because there is no general closed-form expression for $\int_0^x y^{\gamma}e^{-y}\mathrm dy$ where $0 < \gamma < 1$. As noted by Xi'an, the cdf is the incomplete gamma function, and its numerical values have been tabulated.
If integration by parts is not the point of this exercise as suggested in Elvis's comment, you may want to check if your professor wants you to think of the value of a gamma random variable as an arrival time in a Poisson random process and solve the problem from that viewpoint.
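For integer $\alpha$, the repeated integration by parts and the Poisson viewpoint lead to the same closed form, $1 - e^{-x}\sum_{k=0}^{\alpha-1} x^k/k!$: the $\alpha$-th arrival of a unit-rate Poisson process happens by time $x$ exactly when at least $\alpha$ arrivals have occurred by then. Here is a minimal sketch that checks this against a crude numerical integral. The function names, the unit rate, and the values $x = 2.5$, $\alpha = 3$ are my own choices for illustration, not taken from the original problem.

```python
import math

def gamma_cdf_integer_shape(x, alpha):
    """CDF at x of a Gamma(shape=alpha, rate=1) variable, alpha a positive integer.

    Closed form: 1 - P(at most alpha-1 arrivals by time x in a unit-rate
    Poisson process), which is also what repeated integration by parts gives.
    """
    poisson_at_most = sum(
        math.exp(-x) * x**k / math.factorial(k) for k in range(alpha)
    )
    return 1.0 - poisson_at_most

def gamma_cdf_numeric(x, alpha, steps=100_000):
    """Midpoint-rule estimate of (1/Gamma(alpha)) * int_0^x y^(alpha-1) e^(-y) dy."""
    h = x / steps
    total = sum(
        ((i + 0.5) * h) ** (alpha - 1) * math.exp(-(i + 0.5) * h)
        for i in range(steps)
    )
    return total * h / math.gamma(alpha)

# Hypothetical values chosen only for the check.
closed = gamma_cdf_integer_shape(2.5, 3)
numeric = gamma_cdf_numeric(2.5, 3)
print(closed, numeric)  # the two values should agree to several decimals
```

For non-integer $\alpha$ the closed form is not available, so you are back to the incomplete gamma function or numerical integration.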
I think you missed a critical section, and have some reading to do. This is one of the most applicable techniques from the course, so you should make sure you get this. If your textbook isn't good, read other sources.
When you approximate the distribution of $X$ by a normal distribution $Y$, presumably one with the same mean and standard deviation, this means you get to approximate $P(X \gt a)$ by $P(Y \gt a)$, and similarly $P(a \lt X \lt b) \approx P(a \lt Y \lt b)$.
In these problems, you aren't supposed to be calculating the probability that a binomial random variable $X$ is greater than some value by summing the probability mass function for $X$. You are supposed to find a normal distribution which is a good approximation (match the mean and standard deviation), and then look up or estimate $P(Y \gt a)$.
Another way to state this is that you should translate the condition into something like "at most $b$ standard deviations above the mean" and then estimate this as the probability that a standard normal distribution is in that range.
You probably also covered a continuity correction. When you have a discrete random variable which takes integer values, you may want to think of $n$ as the interval $(n-1/2, n+1/2).$ So, $P(X \ge 10)$ really means $P(X \ge 9.5) \approx P(Y \ge 9.5)$. You don't need to do this if $P(9.5 \lt Y \lt 10)$ is negligible, but in many small examples this correction makes the normal approximation much more accurate. There are even more corrections such as the skew-normal approximation, but those are less important and might not have been included in your class.
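To see how much the continuity correction matters, here is a small sketch comparing the exact binomial tail with the normal approximation, with and without the correction. The numbers $n = 20$, $p = 0.4$ and the threshold 10 are hypothetical values I picked for illustration, and the variable names are mine.

```python
import math
from statistics import NormalDist

# Hypothetical example: X ~ Binomial(n=20, p=0.4), and we want P(X >= 10).
n, p, a = 20, 0.4, 10

# Exact tail, by summing the probability mass function.
exact = sum(
    math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(a, n + 1)
)

# Normal distribution Y matching the mean and standard deviation of X.
Y = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))

uncorrected = 1 - Y.cdf(a)      # P(Y >= 10)
corrected = 1 - Y.cdf(a - 0.5)  # P(Y >= 9.5), with continuity correction

print(f"exact       = {exact:.4f}")
print(f"uncorrected = {uncorrected:.4f}")
print(f"corrected   = {corrected:.4f}")
```

In this small example the corrected value lands much closer to the exact tail than the uncorrected one, which is the situation described above where $P(9.5 \lt Y \lt 10)$ is not negligible.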
Best Answer
Height, for instance, is often modelled as being normal. Maybe the height of men is something like 5 foot 10 with a standard deviation of 2 inches. We know negative height is unphysical, but under this model, the probability of observing a negative height is essentially zero. We use the model anyway because it is a good enough approximation.
All models are wrong. The question is "can this model still be useful", and in instances where we are modelling things like height and test scores, modelling the phenomenon as normal is useful despite it technically allowing for unphysical things.
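To put a number on "essentially zero", here is a quick sketch of the tail probabilities involved. The height model (mean 70 inches, standard deviation 2 inches) is the one from the answer above. The exam model (mean 70, standard deviation 10, on a 0 to 100 scale) is purely an assumption of mine for illustration, as are the function names.

```python
import math

def normal_lower_tail(x, mu, sigma):
    """P(Y < x) for Y ~ Normal(mu, sigma), via erfc to avoid rounding to 0 in the far tail."""
    return 0.5 * math.erfc((mu - x) / (sigma * math.sqrt(2)))

def normal_upper_tail(x, mu, sigma):
    """P(Y > x) for Y ~ Normal(mu, sigma)."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2)))

# Height: 5 foot 10 = 70 inches, SD 2 inches, so zero is 35 SDs below the mean.
p_negative_height = normal_lower_tail(0, mu=70, sigma=2)

# Exam scores: hypothetical mean 70 and SD 10 on a 0-100 scale (my assumption).
p_exam_below_0 = normal_lower_tail(0, mu=70, sigma=10)
p_exam_above_100 = normal_upper_tail(100, mu=70, sigma=10)

print(p_negative_height)   # astronomically small, but not exactly zero
print(p_exam_below_0)      # 7 SDs below the mean
print(p_exam_above_100)    # 3 SDs above the mean
```

With these assumed exam numbers, the lower bound at 0 is irrelevant in practice, while the upper bound at 100 is where the normal model would leak the most mass. So whether the approximation is "good enough" depends on how close the mean sits to a boundary relative to the standard deviation.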