Convention of finding Quartiles


Find the quartile deviation for the data
$$
\begin{array}{|c|c|c|c|c|c|}
\hline
x& 2 & 3 & 4&5&6 \\ \hline
f& 3 & 4 & 8&4&1\\ \hline\end{array}
$$

My Attempt
$$
\begin{array}{|c|c|c|c|c|c|}
\hline
x& 2 & 3 & 4&5&6 \\ \hline
f& 3 & 4 & 8&4&1\\ \hline
F& 3 & 7 & 15&19&20\\ \hline
\end{array}
$$
$$
\begin{aligned}
\text{Median} &= \frac{T_{10}+T_{11}}{2}=4\\
Q_1 &= \frac{T_{5}+T_{6}}{2}=3\\
Q_3 &= \frac{T_{15}+T_{16}}{2}=\frac{4+5}{2}=4.5\\
\text{Q.D.} &= \frac{Q_3-Q_1}{2}=\frac{4.5-3}{2}=\frac{1.5}{2}=0.75
\end{aligned}
$$

But my reference gives the answer $1$, with $Q_3=5$. Is the difference really due to the convention used to compute the quartiles?

Which one is correct?

Note: I also tried a few online calculators for finding quartiles with a different data set
$$
2,4,4,5,5,6,7,7,7,8,8,9
$$
which give different values for the quartiles; please check link 1 and link 2

Best Answer

About 10 slightly different rules for defining quartiles are in common use, and a few more are occasionally used in particular fields of study. The differences are mostly noticeable for small sample sizes. R statistical software lets you choose the type of quartile.

Here is a sample of $n=13$ observations rounded to one decimal place.

set.seed(601)
x = round(rnorm(13, 20, 3), 1)
sort(x)
 [1] 14.8 15.2 16.3 18.5 19.1 19.2 19.2 19.6 19.9 20.4 21.5 22.0 25.5

Without extra parameters, the quantile function in R gives the minimum, lower quartile, median, upper quartile, and maximum, using what R calls type 7 quantiles.

quantile(x)
  0%  25%  50%  75% 100% 
14.8 18.5 19.2 20.4 25.5

Other types give different results:

quantile(x, type=3)
  0%  25%  50%  75% 100% 
14.8 16.3 19.2 20.4 25.5 
quantile(x, type=4)
    0%    25%    50%    75%   100% 
14.800 16.850 19.200 20.275 25.500 
quantile(x, type=5)
    0%    25%    50%    75%   100% 
14.800 17.950 19.200 20.675 25.500 
quantile(x, type=6)
   0%   25%   50%   75%  100% 
14.80 17.40 19.20 20.95 25.50 
quantile(x, type=8)
      0%      25%      50%      75%     100% 
14.80000 17.76667 19.20000 20.76667 25.50000

And so on, for a few more types. Each type is supposed to have its own advantages in various circumstances.
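To connect this with the frequency table in the question, here is a minimal sketch that expands the table into its 20 raw observations (an assumption about how to feed grouped data to `quantile`) and asks for the quartiles under each of the nine types. The names `x`, `q`, and `qd` are just illustrative.

```r
# Expand the frequency table from the question into the 20 raw observations.
x <- rep(2:6, times = c(3, 4, 8, 4, 1))

# Lower and upper quartile under each of R's nine quantile types.
q <- sapply(1:9, function(t) {
  quantile(x, probs = c(0.25, 0.75), type = t, names = FALSE)
})
dimnames(q) <- list(c("Q1", "Q3"), paste0("type", 1:9))
print(round(q, 4))

# Quartile deviation (Q3 - Q1) / 2 for each type.
qd <- (q["Q3", ] - q["Q1", ]) / 2
print(round(qd, 4))
```

Type 2 averages the two neighbouring terms when $np$ is an integer, so it should reproduce the $Q_1=3$ and $Q_3=4.5$ from the attempt. As far as I can trace the definitions, every type puts $Q_3$ between 4 and 4.75 here, which suggests the reference's $Q_3=5$ comes from a rule outside these nine.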

For beginning students, my suggestions about quantiles are:

- Don't be surprised if software gives a slightly different result than your text.
- Don't be surprised if different software programs give slightly different results.
- Learn the definition in your text or class notes, and use it during your class.
- Remember that differences are small, but noticeable, for small datasets. For large datasets (where quantiles are most often used) the differences, if any, are seldom important.

Examples for a sample of 1000, rounded to 2 places.

set.seed(2020)
y = round(rnorm(1000, 20, 3), 2)
quantile(y, type=6)
     0%     25%     50%     75%    100% 
10.5000 17.8600 19.8300 21.9175 31.1100 
quantile(y)
     0%     25%     50%     75%    100% 
10.5000 17.8600 19.8300 21.9125 31.1100 
quantile(y, type=8)
      0%      25%      50%      75%     100% 
10.50000 17.86000 19.83000 21.91583 31.11000
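
Returning to the frequency table in the question: the reference's rule is not quoted, so the following is only a guess at it. One convention found in some textbooks for discrete frequency tables takes $Q_k$ to be the smallest $x$ whose cumulative frequency $F(x)$ is at least $k(N+1)/4$. Under that assumption, with $N=20$:

$$
\begin{aligned}
\frac{N+1}{4} &= 5.25, & F(2)=3 < 5.25 \le F(3)=7 \ &\Rightarrow\ Q_1 = 3,\\
\frac{3(N+1)}{4} &= 15.75, & F(4)=15 < 15.75 \le F(5)=19 \ &\Rightarrow\ Q_3 = 5,\\
\text{Q.D.} &= \frac{Q_3-Q_1}{2} = \frac{5-3}{2} = 1.
\end{aligned}
$$

This matches the reference's $Q_3=5$ and quartile deviation of $1$, while averaging $T_{15}$ and $T_{16}$ gives $4.5$ and $0.75$. Both are legitimate under their own definitions.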

Related Solutions

[Math] Is it possible to calculate the mean and standard deviation from a median and quartiles

It's mathematically impossible to deduce mean or standard deviation from median/quartiles, because medians and quartiles discard most of the data on which the mean and standard deviation are based.

Example:

data   frequency  
   0       50      
 1.4        4     
   2       50

That has a mean of 1.0 and a standard deviation of 1.0. (I'm rounding to one decimal place so I don't have to go into population versus sample standard deviation.)

data     frequency    
   0       30        
 1.4       44        
   2       30

That data also has the median and quartiles the same as in your example, but now the mean is 1.2 and the standard deviation is 0.8.

data     frequency        
   0       30        
 1.4        3        
   2       70        
10000000    1

Now I've added one extreme maximum. The quartiles are unchanged and the median has only moved from 1.4 to 2, so you can see even more clearly how the median and quartiles exclude extreme data: the mean is now 96000 and the standard deviation is 980000 (to 2 significant figures).
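
A quick way to check the summary figures for the three tables is to expand each one into raw data and compute them directly. This is only a sketch: `table_summary` is a hypothetical helper, `sd` in R is the sample standard deviation (divisor $n-1$), and `quantile` uses its default type 7, so other conventions may give slightly different quartiles.

```r
# Hypothetical helper: expand a frequency table and summarise it.
table_summary <- function(values, freq) {
  x <- rep(values, times = freq)
  c(mean = mean(x),
    sd = sd(x),                                   # sample SD (divisor n - 1)
    quantile(x, probs = c(0.25, 0.5, 0.75)))      # default type 7 quantiles
}

t1 <- table_summary(c(0, 1.4, 2), c(50, 4, 50))
t2 <- table_summary(c(0, 1.4, 2), c(30, 44, 30))
t3 <- table_summary(c(0, 1.4, 2, 1e7), c(30, 3, 70, 1))

# One row per table, rounded to 2 significant figures.
print(signif(rbind(t1, t2, t3), 2))
```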

[Math] How to calculate lower & upper quartiles

There are 24 values, and you have already ordered them. The 12th and 13th values are 5.00, so the median is indeed 5.00.

The lower quartile is indeed the median of the lower half. Since there are 12 numbers in the lower half, the median is the average of the 6th and the 7th one, which is exactly 1.745. If there were an odd number of items, you could simply take the middle one. (Typically the middle one of an even number of entries is taken to be the average of the two middle ones.)

I leave it to you to check that this method gives the answers you are supposed to get.
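
The median-of-halves rule described above can be sketched as follows. The 24 values from that question are not reproduced here, so the example uses the 12 values quoted in the question at the top of this page. `half_quartiles` is a hypothetical helper, and dropping the middle value when $n$ is odd is an assumption (some texts include it in both halves).

```r
# Hypothetical helper: quartiles as medians of the lower and upper halves.
half_quartiles <- function(x) {
  x <- sort(x)
  half <- length(x) %/% 2        # size of each half; middle value dropped if n is odd
  lower <- head(x, half)
  upper <- tail(x, half)
  c(Q1 = median(lower), median = median(x), Q3 = median(upper))
}

res <- half_quartiles(c(2, 4, 4, 5, 5, 6, 7, 7, 7, 8, 8, 9))
print(res)
```

For these 12 values the lower half is 2, 4, 4, 5, 5, 6 and the upper half is 7, 7, 7, 8, 8, 9, so the sketch averages the 3rd and 4th entries of each half.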
