[Math] How to find the median of grouped data when the sum of the frequencies is odd

median, statistics

How do I find the median of grouped data when the sum of the frequencies is odd? Can anyone explain it with an example? I have searched a lot, but every example on the internet uses an even total. I want to know what to do in the odd case.

Best Answer

It is useful to identify the interval that contains the median, and that can be done without making unwarranted assumptions. For example, consider the $n = 25$ scores below, which have been sorted from smallest to largest.

 75    76    80    82    88    90    91    92    94    95
 99   100   102   103   103   105   106   107   109   113   
113   115   116   116   119

Their exact median is $H = 102,$ the thirteenth observation in the sorted list.
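As a quick check of the exact median, here is a minimal Python sketch. The variable names are illustrative only; the data are the 25 scores listed above. With an odd count, the median is simply the observation at position $(n+1)/2$ in the sorted list.

```python
import statistics

# The 25 sorted scores from the list above.
scores = [
    75, 76, 80, 82, 88, 90, 91, 92, 94, 95,
    99, 100, 102, 103, 103, 105, 106, 107, 109, 113,
    113, 115, 116, 116, 119,
]

n = len(scores)                  # 25, an odd count
middle_position = (n + 1) // 2   # 13 (1-based position of the middle value)
exact_median = sorted(scores)[middle_position - 1]

print(n, middle_position, exact_median)   # 25 13 102

# Cross-check against the standard library.
assert exact_median == statistics.median(scores)
```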

The frequency histogram below is based on the following cutpoints (bin boundaries):

74.5, 84.5, ..., 114.5, 124.5.

These boundaries were chosen so that no (integer) score can fall exactly on a boundary.

The five interval midpoints are 79.5, 89.5, 99.5, 109.5, and 119.5. The frequency histogram below shows that the corresponding frequencies are 4, 5, 6, 6, and 4.
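The grouping step can be sketched in Python as follows. This is only an illustration: the full list of cutpoints is an assumption, namely that the boundaries advance in steps of 10 and end at 124.5, which is what the stated midpoints imply.

```python
from bisect import bisect_left

scores = [
    75, 76, 80, 82, 88, 90, 91, 92, 94, 95,
    99, 100, 102, 103, 103, 105, 106, 107, 109, 113,
    113, 115, 116, 116, 119,
]

# Assumption: common width 10, so the last boundary is 124.5.
cutpoints = [74.5, 84.5, 94.5, 104.5, 114.5, 124.5]

frequencies = [0] * (len(cutpoints) - 1)
for s in scores:
    # No integer score equals a cutpoint, so this index is unambiguous.
    k = bisect_left(cutpoints, s) - 1
    frequencies[k] += 1

midpoints = [(lo + hi) / 2 for lo, hi in zip(cutpoints, cutpoints[1:])]

print(midpoints)     # [79.5, 89.5, 99.5, 109.5, 119.5]
print(frequencies)   # [4, 5, 6, 6, 4]
```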

Just looking at the histogram (or at a table of interval boundaries, midpoints, and frequencies), and without knowledge of the exact values of the $n = 25$ observations, all we can say about the median is that it falls in the interval $(94.5, 104.5)$ with midpoint $99.5$ and frequency $6.$ This interval is called the median interval.
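Locating the median interval from the grouped table alone can be sketched like this. The rule used here (take the first interval whose cumulative frequency reaches $n/2$) is one common convention and is an assumption of this sketch; the names are illustrative.

```python
from itertools import accumulate

# Grouped table only: boundaries (assumed width 10) and frequencies.
cutpoints = [74.5, 84.5, 94.5, 104.5, 114.5, 124.5]
frequencies = [4, 5, 6, 6, 4]

n = sum(frequencies)                         # 25
cumulative = list(accumulate(frequencies))   # [4, 9, 15, 21, 25]

# Index of the first interval whose cumulative count reaches n/2 = 12.5.
m = next(i for i, c in enumerate(cumulative) if c >= n / 2)

median_interval = (cutpoints[m], cutpoints[m + 1])
cf_below = cumulative[m - 1] if m > 0 else 0   # observations below that interval

print(median_interval, cf_below)   # (94.5, 104.5) 9
```

Nothing in this step depends on whether $n$ is even or odd; only the comparison with $n/2$ matters.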

In practice, grouped data tables and histograms are used mainly for samples that are at least moderately large. For a large sample it would ordinarily be sufficient to say that the median falls in the interval with midpoint $99.5.$

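[Figure: frequency histogram of the 25 scores, with bars of height 4, 5, 6, 6, and 4 centered at the midpoints 79.5, 89.5, 99.5, 109.5, and 119.5.]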

A favorite exercise in elementary statistics books is to try to approximate the exact value of the median from a histogram or from grouped data. Doing so requires one to make the assumption (seldom true) that the observations within the median interval are evenly spaced (or uniformly distributed).

One formula for approximating the exact median $H$ is

$$ H = L + \frac{w}{f_m}(.5n - cf_b),$$

where $L = 94.5$ is the lower limit of the median interval, $f_m = 6$ is the frequency of the median interval, $cf_b = 9$ is the number of observations in intervals below the median interval, $w = 10$ is the (common) interval width, and $n = 25$ is the total sample size. This kind of formula is sometimes called an 'interpolation' formula.
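A brief sketch of where such a formula comes from, under the uniform-spacing assumption stated above. The symbol $C(x)$, for the approximate number of observations at or below $x$, is introduced here only for illustration. Within the median interval the cumulative count is taken to grow linearly:

$$ C(x) = cf_b + f_m\,\frac{x - L}{w}, \qquad L \le x \le L + w. $$

Setting $C(H) = n/2$ and solving for $H$ gives

$$ H = L + \frac{w}{f_m}\left(\frac{n}{2} - cf_b\right). $$

Only $n/2$ enters this version of the formula, so it is applied in the same way whether $n$ is even or odd.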

For our data,

$$ H = 94.5 + (10/6)(25/2 - 9) \approx 100.3333.$$
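The same computation as a small Python sketch. The function name `grouped_median` and its argument names are hypothetical, and it assumes equal-width intervals listed in increasing order.

```python
def grouped_median(lower_limits, frequencies, width):
    """Interpolated median H = L + (w / f_m) * (n/2 - cf_b).

    Assumes equal-width intervals in increasing order and that
    observations are evenly spread within the median interval.
    """
    n = sum(frequencies)
    if n <= 0:
        raise ValueError("frequencies must sum to a positive number")

    cf_below = 0  # observations in intervals below the current one
    for lower, f in zip(lower_limits, frequencies):
        if cf_below + f >= n / 2:
            # This is the median interval; interpolate inside it.
            return lower + (width / f) * (n / 2 - cf_below)
        cf_below += f


lower_limits = [74.5, 84.5, 94.5, 104.5, 114.5]
frequencies = [4, 5, 6, 6, 4]

H = grouped_median(lower_limits, frequencies, width=10)
print(round(H, 4))   # 100.3333
```

Python's standard library also has `statistics.median_grouped`, which takes the interval midpoints (one per observation) rather than a frequency table. It appears to apply the same kind of interpolation, but that is worth checking against its documentation before relying on it.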

This procedure is seldom used in serious statistical analysis, and formulas for it can differ a bit from one textbook to another. I do not know the formula in your book, or why you wonder about a distinction between even and odd sample sizes $n$.

I hope this answer is helpful. If you are using a different formula to approximate the median, or if you have further questions, please leave me a Comment and edit your Question to be a little more specific. Then perhaps one of us can be of further help.

Notes: (1) The 25 observations are simulated from $\mathsf{Norm}(\mu = 100,\,\sigma = 15)$ and rounded to integers. So the median of the population from which the data were drawn is $\eta = 100.$ (2) It is not usually a good idea to 'group' datasets with $n$ as small as 25, or to make histograms of such small datasets. I chose this particular illustration because I thought it would make the application of the interpolation formula easy to follow.
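For anyone who wants to generate a similar dataset, here is a minimal sketch of the simulation described in Note (1). The seed is arbitrary and is an assumption of this sketch, so it will not reproduce the 25 scores shown above.

```python
import random
import statistics

rng = random.Random(2024)   # arbitrary seed, chosen only for repeatability

# 25 draws from Norm(mu = 100, sigma = 15), rounded to integers and sorted.
sample = sorted(round(rng.gauss(100, 15)) for _ in range(25))

print(sample)
print(statistics.median(sample))   # sample median; the population median is 100
```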