Normal Distribution Model for Discrete Probability – Is It Possible?

normal distribution, probability

If one wants to find the probability that a normally distributed continuous random variable $X$ falls in the range $a \leq X \leq b$, given a mean $\mu$ and a standard deviation $\sigma$, one integrates the normal density function:

$$\int^b_a \frac{e^{-\frac{(x-\mu)^2}{2\sigma^2}}}{\sigma\sqrt{2\pi}}dx$$

Since this is for continuous probability, is there an alternative to the normal distribution for discrete probability?

Suppose $\mu = 100$ and $\sigma = 50$. For a discrete probability, I would try integrating between bounds that are close together, which gives something similar to, though not exactly, the outcome I want. However, the resulting probability is very low:

$$\int_{99.95}^{100.05} p(x)\,dx \approx 0.0008$$

The probability seems too low for this to be true, which suggests that a different model for discrete probability should exist.
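
The value above is easy to reproduce numerically. A minimal sketch in R, using the base functions pnorm (normal CDF) and dnorm (normal density); the variable names are only illustrative:

```r
# Normal parameters from the question
mu <- 100
sigma <- 50

# P(99.95 <= X <= 100.05) as a difference of two normal CDF values
p_narrow <- pnorm(100.05, mu, sigma) - pnorm(99.95, mu, sigma)

# The interval is so narrow that the density is nearly constant on it,
# so the probability is close to (interval width) * (density at the mean)
p_rect <- 0.1 * dnorm(100, mu, sigma)

print(c(p_narrow, p_rect))  # both are about 0.0008
```

The result is small simply because the interval has width 0.1: for a continuous variable the probability scales with the width of the interval, so even a narrow range around the most likely value receives little probability.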

Best Answer

Your ultimate goal is not clear. Perhaps I can flounder around and make some useful comments.

For appropriate choices of $n$ and $\theta,$ the distribution $Binom(n, \theta)$ is approximately normal, especially if $n$ is large and $\theta$ is not too far from 1/2. The mean is $\mu = n\theta$ and the variance is $\sigma^2 = n\theta(1-\theta).$
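
To see which binomial distribution matches a given mean and standard deviation, the two formulas above can be solved for $n$ and $\theta$: $\theta = 1 - \sigma^2/\mu$ and $n = \mu/\theta$. The helper below is a hypothetical sketch (the name match_binom is not from any package), and it assumes that $n$ may simply be rounded to the nearest integer:

```r
# Hypothetical helper: solve mu = n*theta and sigma^2 = n*theta*(1 - theta)
# for n and theta. Returns NULL when no binomial distribution can match.
match_binom <- function(mu, sigma) {
  theta <- 1 - sigma^2 / mu            # from sigma^2 / mu = 1 - theta
  if (mu <= 0 || theta <= 0) {
    return(NULL)                       # a binomial needs 0 < sigma^2 < mu
  }
  list(n = round(mu / theta), theta = theta)
}

match_binom(50, 5)     # n = 100, theta = 0.5
match_binom(100, 50)   # NULL: variance 2500 exceeds the mean 100
```

A binomial variance is always smaller than its mean, so the values $\mu = 100,$ $\sigma = 50$ from the question cannot be matched by any binomial distribution.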

Also, for large enough $\lambda,$ the distribution $Pois(\lambda)$ is nearly normal. The mean and variance are $\mu = \lambda$ and $\sigma^2 = \lambda.$ However, the Poisson model may have less flexibility in matching what you want.
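
How close "nearly normal" is can be checked directly. The sketch below uses $\lambda = 100$ and the interval $(90, 110],$ both chosen here only for illustration, and compares the exact Poisson probability with a continuity-corrected normal approximation:

```r
lambda <- 100   # illustrative value, not taken from the answer

# Exact: P(90 < X <= 110) for X ~ Pois(lambda)
p_exact <- ppois(110, lambda) - ppois(90, lambda)

# Normal approximation with mean lambda and SD sqrt(lambda),
# using a continuity correction of 0.5 at each end
p_approx <- pnorm(110.5, lambda, sqrt(lambda)) -
  pnorm(90.5, lambda, sqrt(lambda))

print(c(exact = p_exact, approx = p_approx))
```

Because the Poisson distribution has a single parameter, its variance is forced to equal its mean, which is the lack of flexibility mentioned above.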

Of course, to find the probability that a random variable taking integer values lies in an interval $(a, b]$ you will add probabilities for integer values in that interval, rather than evaluating an integral.

For example, if $X \sim Binom(n = 100, \theta = 1/2),$ you have $\mu = 50$ and $\sigma = 5.$ Perhaps you want

$$\begin{aligned} P(48 < X \le 52) &= P(X = 49) + P(X = 50) + P(X = 51) + P(X = 52)\\ &= P(X \le 52) - P(X \le 48) = F_X(52) - F_X(48) = 0.3091736, \end{aligned}$$ where $F_X(\cdot)$ is the CDF of $X.$

If there are many integers in the desired interval, computation by hand can be tedious. In R statistical software, dbinom denotes a binomial PDF (strictly speaking, a probability mass function) and pbinom a binomial CDF.

The probability above could be evaluated in R as shown below. [The last value is a normal approximation (with continuity correction), which is often accurate to a couple of decimal places.]

 sum(dbinom(49:52, 100, .5))      # adding terms of the PDF
 ## 0.3091736
 diff(pbinom(c(48,52), 100, .5))  # subtracting two CDF values
 ## 0.3091736
 diff(pnorm(c(48.5,52.5), 50, 5)) # normal approximation
 ## 0.3093739
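
The same continuity-correction idea gives one possible reading of the original question. Assuming (this is an assumption, not something stated in the question) that the discrete variable is a normal variable with $\mu = 100$ and $\sigma = 50$ rounded to the nearest integer, the probability of the single value 100 is the normal area between 99.5 and 100.5:

```r
mu <- 100
sigma <- 50

# P(X rounds to 100): area under the normal density between 99.5 and 100.5
p_100 <- diff(pnorm(c(99.5, 100.5), mu, sigma))
p_100

# Probabilities of all integer values within 4 SD of the mean
k <- (mu - 4 * sigma):(mu + 4 * sigma)
p_k <- pnorm(k + 0.5, mu, sigma) - pnorm(k - 0.5, mu, sigma)
sum(p_k)   # close to 1
```

With $\sigma = 50$ the probability is spread over several hundred integers, so each individual value necessarily has a small probability.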

The figure below shows several values of the PDF of $Binom(100, .5),$ emphasizes the four probabilities required (heights of thick blue bars), and shows the approximating normal density curve. The normal approximation is the area beneath the curve between the vertical green lines.

Related Solutions

[Math] If 100% of data fall within first two standard deviations of the mean, is the distribution Normal

The naive approach is to expect about 5 of 100 samples to fall outside 2 SD, with a standard deviation of about $\sqrt 5 \approx 2.2$ on that count, so getting 0 is only about a 2 SD event (and maybe not even that unusual, since the measured SD is probably a bit low).
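
The estimate can be made a little more precise. Under the assumptions that the 100 observations are independent and exactly normal, and that the true SD is known (none of which is guaranteed in practice), the number of observations outside 2 SD is binomial:

```r
# Probability that a single normal observation lies more than 2 SD from the mean
p_out <- 2 * pnorm(-2)          # about 0.0455

# Expected count and SD of the count among 100 observations
n <- 100
c(mean = n * p_out, sd = sqrt(n * p_out * (1 - p_out)))

# Probability that none of the 100 observations falls outside 2 SD
p_none <- dbinom(0, n, p_out)
p_none
```

Under these assumptions the chance of seeing no observations outside 2 SD is small but not negligible, which is in line with the rough reasoning above.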

[Math] Calculating probability for a normal distribution

You'll want to find the probability that someone works overtime more than $165$ minutes in a given week. You're on the right track with this. The probability of it occurring in a given week is

$$1-\Phi\left(\frac{165-135}{30}\right)\approx0.1587$$

> 1-pnorm((165-135)/30)
[1] 0.1586553

To find the probability of it happening in at least four of the five weeks, you'll want to make use of the binomial distribution:

$$P(X=k)={n \choose k}p^k(1-p)^{n-k}$$

We wish to find $P(X=4)+P(X=5)$ where $n=5$ and $p$ is the probability obtained above.
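
In R the two terms can be added with dbinom, or obtained as an upper tail with pbinom. A short sketch that recomputes $p$ as above; the variable names are only illustrative:

```r
# Probability of more than 165 minutes of overtime in a single week
p <- 1 - pnorm((165 - 135) / 30)

# P(X = 4) + P(X = 5) for X ~ Binom(5, p)
p_at_least_4 <- sum(dbinom(4:5, 5, p))

# Same quantity as an upper tail: P(X > 3)
p_tail <- pbinom(3, 5, p, lower.tail = FALSE)

print(c(p_at_least_4, p_tail))  # both about 0.0028
```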
