Solved – How to get any quantiles given median value and margin of error

inference, quantiles, sample-size, sampling

I am trying to get the 25th and 75th percentiles of the population based on two values that summarize the sample:

  • median value
  • 90 percent margin of error

I don't have any other information, such as the sample size or the standard error. I think it's safe to assume the samples were drawn from a normal distribution.

The 90 percent margin of error in the original document is described as follows:

The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value.

Edit: added the description of the margin of error to clarify the question.

Best Answer

Assume a sample size of 200, with mean (mu) = 20, and standard deviation (sigma) = 10.

import numpy as np

mu, sigma = 20, 10 # mean and standard deviation
s = np.random.normal(mu, sigma, 200)

np.quantile(s, 0.25)
np.quantile(s, 0.75)

I'm using Python for this example, but you can see that we are:

1) generating an array of 200 normally distributed random numbers

2) obtaining the 25th and 75th percentiles of that array

>>> np.quantile(s, 0.25)
11.700325588242732
>>> np.quantile(s, 0.75)
26.11671871467393
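
As a cross-check, if the population really is normal, its quartiles follow directly from mu and sigma: they sit at roughly mu ± 0.6745 * sigma. Here is a minimal sketch using the standard-library statistics.NormalDist (my choice for illustration, not part of the numpy code above), with the same assumed mu = 20 and sigma = 10:

```python
from statistics import NormalDist

mu, sigma = 20, 10  # same assumed mean and standard deviation as above

population = NormalDist(mu, sigma)

# Theoretical quartiles of the normal population (no sampling involved)
q25 = population.inv_cdf(0.25)
q75 = population.inv_cdf(0.75)

print(round(q25, 2), round(q75, 2))  # roughly 13.26 and 26.74
```

The sample quantiles shown above differ from these values because they are computed from only 200 random draws.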

Now, when you say "90% margin of error", I am assuming you mean the half-width of a 90% "confidence interval". In this case, the confidence level is 90%, which leaves 10% outside the interval (5% in each tail).
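
If the reported margin of error is the half-width of a 90% confidence interval for the median, it can be turned into a standard error by dividing by z ≈ 1.645. Going from there to the population quartiles needs the sample size, which the question does not have, so the sketch below treats n as an assumption. It also relies on the large-sample approximation that, for normal data, the standard error of the sample median is about sqrt(pi/2) * sigma / sqrt(n). The function name and the input values (median 50, margin of error 2, n = 200) are hypothetical and only there for illustration:

```python
import math
from statistics import NormalDist


def quartiles_from_median_and_moe(median, moe, n, level=0.90):
    """Rough population quartiles from a median, its margin of error,
    and an ASSUMED sample size n (normal population assumed)."""
    # z-value for a two-sided interval at the given level (about 1.645 for 90%)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Margin of error = z * standard error of the median
    se_median = moe / z
    # Large-sample approximation for normal data:
    # SE(median) ~= sqrt(pi / 2) * sigma / sqrt(n), solved here for sigma
    sigma = se_median * math.sqrt(n) / math.sqrt(math.pi / 2)
    # For a normal population the median equals the mean
    population = NormalDist(median, sigma)
    return population.inv_cdf(0.25), population.inv_cdf(0.75)


# Hypothetical inputs, not taken from the question
q25, q75 = quartiles_from_median_and_moe(median=50.0, moe=2.0, n=200)
print(q25, q75)
```

The result scales with sqrt(n), so a wrong guess for the sample size shifts both quartiles considerably. Without n (or the standard deviation) the quartiles cannot be pinned down from the median and margin of error alone.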

Using the scipy library (also from Python), we can obtain a 90% confidence interval as follows:

from scipy import stats
stats.norm.interval(0.90, loc=mu, scale=s)

More detail can be found on the above here.

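Note that scale=s passes the entire sample array rather than a single number such as sigma, so instead of one interval the call returns a pair of arrays, holding a lower and an upper bound for each element of s: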

>>> stats.norm.interval(0.90, loc=mu, scale=s)
(array([-20.1017426 , -50.41395259, -15.74140484, -34.9162548 ,
       -14.55505407, -26.20186343,  -8.38349335, -28.15329328,
............
         0.3405667 ,  14.1913693 , -44.18605464, -18.30478346]), 
array([60.1017426 , 90.41395259, 55.74140484, 74.9162548 , 54.55505407,
       66.20186343, 48.38349335, 68.15329328, 42.42820445, 70.17147704,
............
       55.23983044, 41.10373296, 51.30638793, 57.20990033, 47.99641712]))
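
For comparison, here is a minimal sketch of what a single interval looks like when the scale is a scalar. It uses the standard-library statistics.NormalDist rather than scipy (my choice, to keep the example dependency-free) and the same assumed mu = 20, sigma = 10 and n = 200. It separates two things that are easy to mix up: the range covering the central 90% of individual values, and the 90% confidence interval for the mean, whose half-width is the margin of error:

```python
import math
from statistics import NormalDist

mu, sigma, n = 20, 10, 200  # same assumed values as above

# A two-sided 90% interval uses the 95th percentile of the standard normal
z = NormalDist().inv_cdf(0.95)  # about 1.645

# Range containing the central 90% of individual values
spread_interval = (mu - z * sigma, mu + z * sigma)

# 90% confidence interval for the mean: the scale is the standard error
se_mean = sigma / math.sqrt(n)
moe = z * se_mean  # the 90% margin of error
mean_interval = (mu - moe, mu + moe)

print(spread_interval)  # roughly (3.55, 36.45)
print(mean_interval)    # roughly (18.84, 21.16)
```

With scipy, the corresponding calls should be stats.norm.interval(0.90, loc=mu, scale=sigma) and stats.norm.interval(0.90, loc=mu, scale=sigma / np.sqrt(n)).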

The above is obviously dependent on which software you are using and what dataset you are working with, but hopefully you might find these guidelines useful.
