Normal Distribution Model for Discrete Probability – Is It Possible?

normal distribution, probability

To find the probability that a continuous random variable falls in the range $a \leq X \leq b$, given a mean $\mu$ and a standard deviation $\sigma$, one integrates the normal density function:

$$\int^b_a \frac{e^{-\frac{(x-\mu)^2}{2\sigma^2}}}{\sigma\sqrt{2\pi}}dx$$

Since this applies to continuous probability, is there an alternative to the normal distribution for discrete probability?


Suppose $\mu = 100$ and $\sigma = 50$. For a discrete probability, I would try integrating over bounds that are close together around 100, which gives something similar to, but not exactly, the desired result. However, the probability is very low:

$$\int^{100.05}_{99.95}p(x)dx = 0.0008$$

This probability seems too low to be right, which suggests that a different model for discrete probability should exist.
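
For reference, the value 0.0008 can be checked numerically. The following is a minimal R sketch, assuming $p(x)$ is the normal density with $\mu = 100$ and $\sigma = 50$; the variable names are only illustrative. It also shows that the result is essentially the interval width (0.1) times the height of the density at the mean, which is why narrowing the bounds makes the probability small.

```r
mu <- 100
sigma <- 50

# Difference of two normal CDF values over the narrow interval
p_cdf <- pnorm(100.05, mu, sigma) - pnorm(99.95, mu, sigma)

# The same quantity by numerically integrating the normal density
p_int <- integrate(dnorm, lower = 99.95, upper = 100.05,
                   mean = mu, sd = sigma)$value

# Rectangle approximation: interval width times density at the mean
p_rect <- 0.1 * dnorm(mu, mu, sigma)

round(c(p_cdf, p_int, p_rect), 4)
## 0.0008 0.0008 0.0008
```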

Best Answer

Your ultimate goal is not clear. Perhaps I can flounder around and make some useful comments.

For appropriate choices of $n$ and $\theta,$ the distribution $Binom(n, \theta)$ is approximately normal, especially if $n$ is large and $\theta$ is not too far from 1/2. The mean is $\mu = n\theta$ and the variance is $\sigma^2 = n\theta(1-\theta).$
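
As a rough check of the formulas above, the following R sketch computes the mean and variance of $Binom(100, 1/2)$ directly from its probabilities and compares them with $n\theta$ and $n\theta(1-\theta)$. The choice $n = 100,$ $\theta = 1/2$ and all variable names are assumptions made only for illustration. The last lines measure how far the binomial CDF is from the matching normal CDF (with continuity correction).

```r
n <- 100
theta <- 0.5
k <- 0:n                         # all possible values of X
pmf <- dbinom(k, n, theta)       # P(X = k) for each k

# Mean and variance from the formulas
mu_formula  <- n * theta
var_formula <- n * theta * (1 - theta)

# Mean and variance computed directly from the probabilities
mu_direct  <- sum(k * pmf)
var_direct <- sum((k - mu_direct)^2 * pmf)

c(mu_formula, mu_direct)
c(var_formula, var_direct)

# Largest gap between the binomial CDF and the normal CDF
# evaluated at k + 0.5 (continuity correction)
max_gap <- max(abs(pbinom(k, n, theta) -
                   pnorm(k + 0.5, mu_formula, sqrt(var_formula))))
max_gap
```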

Also, for large enough $\lambda,$ the distribution $Pois(\lambda)$ is nearly normal. The mean and variance are $\mu = \lambda$ and $\sigma^2 = \lambda.$ However, the Poisson model may have less flexibility in matching what you want.
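
A similar sketch for the Poisson case, assuming $\lambda = 100$ and an arbitrarily chosen interval $(90, 110]$ purely for illustration: the exact probability comes from the Poisson CDF, and the approximation uses a normal distribution with mean $\lambda$ and standard deviation $\sqrt{\lambda},$ with continuity correction.

```r
lambda <- 100

# Exact: P(90 < X <= 110) from two Poisson CDF values
exact <- diff(ppois(c(90, 110), lambda))

# Normal approximation with mean lambda, sd sqrt(lambda),
# using a continuity correction at both ends
approx_p <- diff(pnorm(c(90.5, 110.5), lambda, sqrt(lambda)))

c(exact = exact, normal_approx = approx_p)
```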

Of course, to find the probability that a random variable taking integer values lies in an interval $(a, b]$ you will add probabilities for integer values in that interval, rather than evaluating an integral.

For example, if $X \sim Binom(n = 100, \theta = 1/2),$ you have $\mu = 50$ and $\sigma = 5.$ Perhaps you want

$$\begin{aligned} P(48 < X \le 52) &= P(X = 49) + P(X = 50) + P(X = 51) + P(X = 52)\\ &= P(X \le 52) - P(X \le 48) = F_X(52) - F_X(48) = 0.3091736, \end{aligned}$$

where $F_X(\cdot)$ is the CDF of $X.$

If there are many integers in the desired interval, computation by hand can be tedious. In R statistical software, dbinom denotes a binomial PDF (strictly speaking, a probability mass function) and pbinom a binomial CDF.

The probability above could be evaluated in R as shown below. [The last value is a normal approximation (with continuity correction), which is often accurate to a couple of decimal places.]

 sum(dbinom(49:52, 100, .5))      # adding terms of the PDF
 ## 0.3091736
 diff(pbinom(c(48,52), 100, .5))  # subtracting two CDF values
 ## 0.3091736
 diff(pnorm(c(48.5,52.5), 50, 5)) # normal approximation
 ## 0.3093739

The figure below shows several values of the PDF of $Binom(100, .5),$ emphasizes the four probabilities required (heights of thick blue bars), and shows the approximating normal density curve. The normal approximation is the area beneath the curve between the vertical green lines.

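[Figure: PDF of $Binom(100, .5)$ with the four required probabilities as thick blue bars, the approximating normal density curve, and vertical green lines marking the interval.]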
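
A plot along the lines described above could be drawn in base R roughly as follows. This is only a sketch, not the code behind the original figure: the plotting range 35 to 65, the red color of the curve, and the placement of the green lines at 48.5 and 52.5 (matching the continuity correction) are assumptions.

```r
n <- 100
theta <- 0.5
mu <- n * theta
sigma <- sqrt(n * theta * (1 - theta))

k <- 35:65                        # plotting range (assumed)
pmf <- dbinom(k, n, theta)
wanted <- k >= 49 & k <= 52       # the four required values

# Thin vertical bars for the binomial probabilities
plot(k, pmf, type = "h", ylim = c(0, max(pmf) * 1.05),
     xlab = "x", ylab = "Probability",
     main = "Binom(100, .5) with normal approximation")

# Thick blue bars for P(X = 49), ..., P(X = 52)
segments(k[wanted], 0, k[wanted], pmf[wanted], col = "blue", lwd = 4)

# Approximating normal density curve
curve(dnorm(x, mu, sigma), add = TRUE, col = "red")

# Vertical green lines bounding the area used by the approximation
abline(v = c(48.5, 52.5), col = "darkgreen", lty = "dotted")
```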
