Convention of finding Quartiles

median, statistics

Find the quartile deviation for the data
$$
\begin{array}{|c|c|c|c|c|c|}
\hline
x& 2 & 3 & 4&5&6 \\ \hline
f& 3 & 4 & 8&4&1\\ \hline\end{array}
$$

My Attempt
$$
\begin{array}{|c|c|c|c|c|c|}
\hline
x& 2 & 3 & 4&5&6 \\ \hline
f& 3 & 4 & 8&4&1\\ \hline
F& 3 & 7 & 15&19&20\\ \hline
\end{array}
$$

$$
\begin{aligned}
\text{Median} &= \frac{T_{10}+T_{11}}{2} = 4\\
Q_1 &= \frac{T_{5}+T_{6}}{2} = 3\\
Q_3 &= \frac{T_{15}+T_{16}}{2} = \frac{4+5}{2} = 4.5\\
\text{Q.D.} &= \frac{Q_3-Q_1}{2} = \frac{4.5-3}{2} = \frac{1.5}{2} = 0.75
\end{aligned}
$$

But my reference gives the answer $1$, with $Q_3=5$. Is the difference really due to the convention used for computing the quartiles?

Which one is correct?

Note: I also tried a few online calculators for finding quartiles, using a different data set
$$
2,4,4,5,5,6,7,7,7,8,8,9
$$

and they give different values for the quartiles; please check link 1 and link 2.


Best Answer

About 10 slightly different rules for defining quartiles are in common use, and a few more are occasionally used in particular fields of study. The differences are mostly noticeable for small sample sizes. R statistical software lets you choose the type of quartile.

Here is a sample of $n=13$ observations rounded to one decimal place.

set.seed(601)
x = round(rnorm(13, 20, 3), 1)
sort(x)
 [1] 14.8 15.2 16.3 18.5 19.1 19.2 19.2 19.6 19.9 20.4 21.5 22.0 25.5

Without extra parameters, the quantile function in R gives the min, lower quartile, median, upper quartile, and max, using what R calls type 7 quantiles.

quantile(x)
  0%  25%  50%  75% 100% 
14.8 18.5 19.2 20.4 25.5 

Other types give different results:

quantile(x, type=3)
  0%  25%  50%  75% 100% 
14.8 16.3 19.2 20.4 25.5 
quantile(x, type=4)
    0%    25%    50%    75%   100% 
14.800 16.850 19.200 20.275 25.500 
quantile(x, type=5)
    0%    25%    50%    75%   100% 
14.800 17.950 19.200 20.675 25.500 
quantile(x, type=6)
   0%   25%   50%   75%  100% 
14.80 17.40 19.20 20.95 25.50 
quantile(x, type=8)
      0%      25%      50%      75%     100% 
14.80000 17.76667 19.20000 20.76667 25.50000 

And so on, for a few more types. Each type is supposed to have its own advantages in various circumstances.
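To make the differences less mysterious, here is a minimal sketch of how the interpolating types 4 to 8 can be reproduced by hand. It assumes the position rule described in R's `?quantile` help page: compute a fractional position h = n p + m in the sorted sample, where only the offset m depends on the type, then interpolate linearly between the two neighbouring order statistics. The helper name `manual_quantile` is hypothetical, and the data are copied from the sorted output above.

```r
# Sorted sample copied from the output above (n = 13).
x <- c(14.8, 15.2, 16.3, 18.5, 19.1, 19.2, 19.2,
       19.6, 19.9, 20.4, 21.5, 22.0, 25.5)

# Hypothetical helper (not part of R): quantile types 4 to 8 written
# as "position + linear interpolation" rules.
manual_quantile <- function(x, p, type = 7) {
  x <- sort(x)
  n <- length(x)
  # Offset m is the only thing that differs between these types.
  m <- switch(as.character(type),
              "4" = 0,
              "5" = 0.5,
              "6" = p,
              "7" = 1 - p,
              "8" = (p + 1) / 3,
              stop("only types 4 to 8 are sketched here"))
  h <- n * p + m                   # fractional position in the sorted sample
  j <- floor(h)                    # order statistic just below the position
  g <- h - j                       # fractional part used as interpolation weight
  lo <- x[min(max(j, 1), n)]       # clamp indices to the range 1..n
  hi <- x[min(max(j + 1, 1), n)]
  (1 - g) * lo + g * hi
}

# Lower quartile under types 4, 5, 6, 7, 8.
sapply(4:8, function(t) manual_quantile(x, 0.25, type = t))
```

For this sample the five values should agree with the 25% entries printed above (16.85, 17.95, 17.40, 18.50 and 17.76667). For example, type 7 uses position 13(0.25) + 0.75 = 4, which is exactly the fourth sorted value, 18.5.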

For beginning students, my suggestions about quantiles are:

  • Don't be surprised if software gives a slightly different result than your text.

  • Don't be surprised if different software programs give slightly different results.

  • Learn the definition in your text or class notes, and use it during your class.

  • Remember that differences are small, but noticeable, for small datasets. But for large datasets (where quantiles are most often used) the differences, if any, are seldom important.

Examples for a sample of 1000 observations, rounded to two decimal places:

set.seed(2020)
y = round(rnorm(1000, 20, 3), 2)
quantile(y, type=6)
     0%     25%     50%     75%    100% 
10.5000 17.8600 19.8300 21.9175 31.1100 
quantile(y)
     0%     25%     50%     75%    100% 
10.5000 17.8600 19.8300 21.9125 31.1100 
quantile(y, type=8)
      0%      25%      50%      75%     100% 
10.50000 17.86000 19.83000 21.91583 31.11000 
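Returning to the frequency table in the question, the sketch below shows how two conventions can lead to the two competing answers. The first rule is only an assumption about what the reference does: it is a common textbook rule for discrete frequency tables (take the value whose cumulative frequency first reaches position (N+1)/4 or 3(N+1)/4), and it happens to reproduce $Q_3=5$ and a quartile deviation of $1$. The second is the averaging rule used in the attempt, which coincides with R's type 2 for these data. The helper name `item_rule` is hypothetical.

```r
x  <- c(2, 3, 4, 5, 6)   # values from the question
f  <- c(3, 4, 8, 4, 1)   # frequencies from the question
cf <- cumsum(f)          # cumulative frequencies: 3 7 15 19 20
N  <- sum(f)             # total number of observations: 20

# Assumed textbook rule (hypothetical helper): the quartile is the first
# value whose cumulative frequency reaches the given position.
item_rule <- function(pos) x[which(cf >= pos)[1]]

q1_ref <- item_rule((N + 1) / 4)       # position 5.25
q3_ref <- item_rule(3 * (N + 1) / 4)   # position 15.75
qd_ref <- (q3_ref - q1_ref) / 2

# Averaging rule from the attempt, via R's type 2 on the expanded data.
raw    <- rep(x, times = f)
q_ask  <- quantile(raw, c(0.25, 0.75), type = 2, names = FALSE)
qd_ask <- diff(q_ask) / 2

c(reference = qd_ref, attempt = qd_ask)
```

Under these assumptions the first rule gives quartiles 3 and 5 (quartile deviation 1), while the second gives 3 and 4.5 (quartile deviation 0.75). Neither is wrong; they follow different definitions, so use the one your text prescribes.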