Solved – Calculate standard deviation given mean and percentage

normal distribution, standard deviation

Some values have a normal distribution with mean .0276. What standard deviation is required so that 98% of values are between .0275 and .0278?

What confuses me is how to calculate the standard deviation when Z lies between two bounds that are not symmetric about zero. I know that P(-.0001/σ < Z < .0002/σ) = .98, but I don't know where to go from here.

Best Answer

We can solve this problem almost instantly in our heads using the "68-95-99.7" rule. I will explain the process in detail because that is what matters. The answer is of little interest: the point of this question is to help us learn to think about probability distributions.

These numbers in the 68-95-99.7 rule are (approximately) the percent chances that a Normal variable lies within one, two, and three standard deviations of its mean. By subtracting these numbers from 100% it follows that the chances of a Normal variable lying beyond one, two, and three SDs of its mean are about 32, 5, and 0.3 percent, respectively. Since this distribution is symmetric, we can split each of these numbers in half to find the chances of lying beyond one, two, and three SDs of the mean in a given direction: the values are about 16, 2.5, and 0.15 percent, respectively. (Slightly more accurate values are shown in the figure.)

Figure: Standard Normal Density

The figure uses areas to represent chances. The leftmost value of 16%, for instance, is the proportion of all the area under the curve that lies to the left of -1. The "tail areas" associated with the numbers $Z = -3,-2,-1, 1,2,3$ are labeled. (These areas overlap; for instance, the 16% values include regions accounted for by the 2.3% and 0.13% values.)
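The numbers in the rule and the tail areas in the figure can be reproduced with `pnorm`, the standard Normal distribution function in R. This is only a small sketch; the variable names are mine.

```r
# Chance that a Normal variable lies within k SDs of its mean, for k = 1, 2, 3
k <- 1:3
within <- pnorm(k) - pnorm(-k)

# One-sided tail area beyond k SDs (by symmetry, half of what is left over)
tail_one_side <- pnorm(-k)

round(100 * within, 2)         # 68.27 95.45 99.73
round(100 * tail_one_side, 2)  # 15.87  2.28  0.13
```

The rounded tail values (about 16%, 2.3%, and 0.13%) are the ones labeled in the figure.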

People who think effectively about probabilities use mental figures like this one.

Turn to the data in the question: 0.0275 is 0.0001 to the left of the mean of 0.0276 while 0.0278 is 0.0002 to the right of the mean: twice as far. We therefore need to enclose 98% of the probability between an unknown number of standard deviations to the left of the mean--call this multiple $-Z$ to indicate it's to the left--and twice that number of standard deviations to the right of the mean, which therefore is $2Z.$
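Written as an equation, with $\Phi$ denoting the standard Normal distribution function (a symbol introduced here for convenience) and $X$ the original variable, the requirement is

$$\Pr(0.0275 < X < 0.0278) = \Phi(2Z) - \Phi(-Z) = 0.98, \qquad Z = \frac{0.0001}{\sigma}.$$

This is the same statement as the one in the question, $P(-0.0001/\sigma < Z_0 < 0.0002/\sigma) = 0.98$ for a standard Normal $Z_0$, with the unknown $\sigma$ replaced by the unknown multiple $Z.$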

Equivalently, 100 - 98 = 2% of the probability must lie beyond this range. The figure shows 2.3% of the probability lies to the left of $-Z=-2$ and essentially 0% lies to the right of $2Z=2\times 2=4,$ so $Z=2$ would be an accurate guess (albeit a tad low).

The only arithmetic needed to get to this point involved subtractions, one division (of 0.0002 / 0.0001) and halving.
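The guess $Z=2$ can be checked directly by adding up the two tail areas it leaves out. This is a sketch; `Z_guess`, `left_tail`, `right_tail`, and `outside` are names chosen for illustration.

```r
Z_guess <- 2

left_tail  <- pnorm(-Z_guess)                         # area to the left of -Z
right_tail <- pnorm(2 * Z_guess, lower.tail = FALSE)  # area to the right of 2Z

outside <- left_tail + right_tail
c(left = left_tail, right = right_tail, outside = outside)
```

The total outside the range comes out slightly above the 2% target, almost all of it in the left tail, which is why $Z=2$ is a tad low.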

If you would like to get a little closer to "the" answer, look up (or compute) the value of $Z$ for which 2% of the probability is to the left of $-Z$: that's $Z=2.054.$ It's still the case that essentially 0% is to the right of $2Z \approx 4.1.$ (Because there actually is a tiny bit of probability beyond $4.1,$ the correct value of $Z$ must be just a tiny bit more than $2.054.$)
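In R the table lookup can be replaced by `qnorm`, the inverse of `pnorm`. A minimal sketch (the names `Z_table` and `right_tail` are mine):

```r
# Z for which 2% of the probability lies to the left of -Z
Z_table <- -qnorm(0.02)
Z_table                      # about 2.054

# Probability to the right of 2Z: tiny, but not exactly zero
right_tail <- pnorm(2 * Z_table, lower.tail = FALSE)
right_tail
```

Because `right_tail` is positive, the two tails together slightly exceed 2%, so the exact $Z$ has to be a little larger than this lookup value.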

Either way, we come up with the result that $Z$ is somewhere around $2$ or $2.054.$

Finally, return to the data in the problem: $Z$ standard deviations equals $0.0001$ (or $2Z$ standard deviations equals $0.0002:$ it's all the same). Our answers therefore are

  • Quick and dirty, based on the 68-95-99.7 rule: $0.0001/2 = 0.00005.$

  • A little more refined, based on a table lookup: $0.0001/2.054 \approx 0.0000486\,91.$

We know both of these answers will be a little too large, but the second must be quite accurate.
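One way to confirm that both answers are a little too large is to compute how much probability each one actually places between 0.0275 and 0.0278. The sketch below assumes the table value is `qnorm(0.98)` rather than the rounded 2.054; `coverage` and the other names are hypothetical helpers, not part of the original calculation.

```r
mu <- 0.0276
lo <- 0.0275
hi <- 0.0278

sigma_quick <- (mu - lo) / 2            # from the 68-95-99.7 rule
sigma_table <- (mu - lo) / qnorm(0.98)  # from the "table lookup"

# Probability that a Normal(mu, sigma) value falls between lo and hi
coverage <- function(sigma) pnorm(hi, mu, sigma) - pnorm(lo, mu, sigma)

c(quick = coverage(sigma_quick), table = coverage(sigma_table))
```

Both coverages fall short of 0.98, the second only barely, so both standard deviations need to shrink slightly.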


Having gone through this thought process, we could write down the following R commands immediately because they directly carry out the calculation (albeit more accurately):

(Z <- uniroot(function(z) pnorm(2*z)-pnorm(-z) - 0.98, c(2,3))$root)

2.054158

That agrees with the three decimal digit table I used to get $2.054.$

(0.0276 - 0.0275) / Z

4.868176e-05

It agrees with our first answer almost to two significant figures and with the second answer almost to four significant figures--more than we really deserve.
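The same root-finding idea can be applied to the standard deviation itself, in the original units. The following is a sketch under stated assumptions: `solve_sigma` is a hypothetical helper, it requires `lo < mu < hi`, and the search interval is an ad hoc choice based on the width of the range. Note the explicit `tol`: the default tolerance of `uniroot` is larger than the answer we are looking for.

```r
# Find sigma such that a Normal(mu, sigma) variable lies in (lo, hi) with probability p
solve_sigma <- function(mu, lo, hi, p) {
  stopifnot(lo < mu, mu < hi, p > 0, p < 1)
  f <- function(s) pnorm(hi, mu, s) - pnorm(lo, mu, s) - p
  width <- hi - lo
  # Coverage decreases from near 1 to near 0 as sigma grows, so f changes sign once
  uniroot(f, interval = c(width / 1000, width * 10), tol = 1e-12)$root
}

sigma <- solve_sigma(0.0276, 0.0275, 0.0278, 0.98)
sigma

# Check: this should be 0.98 up to numerical error
pnorm(0.0278, 0.0276, sigma) - pnorm(0.0275, 0.0276, sigma)
```

The result should agree with the value of `(0.0276 - 0.0275) / Z` above to several significant figures.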