Solved – Quartiles in Excel

excel, quantiles

I am interested in the definition of quartile that is usually used in basic statistics. My Stat 101-type book gives only an intuitive definition: "About one quarter of the data falls on or below the first quartile…" It does, however, work an example in which it calculates Q1, Q2, and Q3 for the data set

5, 7, 9, 10, 11, 13, 14, 15, 16, 17, 18, 18, 20, 21, 37

Since there are 15 data points, it takes the middle (8th) value, 15, as the median, Q2. It then splits the remaining data into two halves, 5 through 14 and 16 through 37. Each half contains 7 values, and the book takes the medians of these halves, 10 and 18, as Q1 and Q3, respectively. This is how I would calculate it myself.

I looked at Wikipedia's article and it gives 2 methods. One agrees with the above, and one says you could also include the median 15 in both sets (but you wouldn't include the median if it was the average of the two middle numbers in the case of an even number of data points). This all makes sense to me.
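The two hand methods described above can be sketched in a few lines of Python. The function name `median_split_quartiles` and the `include_median` flag are illustrative names, not taken from the book or from Wikipedia; this is only a sketch of the procedure as described.

```python
from statistics import median

def median_split_quartiles(data, include_median=False):
    """Return (Q1, Q2, Q3) by splitting the sorted data at the median.

    include_median=False: the middle value is left out of both halves.
    include_median=True: when the count is odd, the middle value is put
    in both halves. For an even count the flag has no effect, because
    the median is then an average rather than a data point.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    if n % 2 == 1 and include_median:
        lower, upper = xs[:half + 1], xs[half:]
    else:
        lower, upper = xs[:half], xs[n - half:]
    return median(lower), median(xs), median(upper)

book = [5, 7, 9, 10, 11, 13, 14, 15, 16, 17, 18, 18, 20, 21, 37]
print(median_split_quartiles(book))                       # (10, 15, 18)
print(median_split_quartiles(book, include_median=True))  # (10.5, 15, 18.0)
print(median_split_quartiles(range(1, 11)))               # (3, 5.5, 8)
```

For the integers 1 through 10 the flag makes no difference, which is why both methods lead to the same expected quartiles of 3, 5.5, and 8.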

But then I checked how Excel calculates it. I am using Excel 2010, which has three different functions. Quartile was available in 2007 and earlier versions; it seems Microsoft wants you to stop using it in 2010, but it is still available. Quartile.Inc is new but agrees exactly with Quartile as far as I can tell. There is also Quartile.Exc. I believe both of the last two are new in 2010.

This time I just tried the integers 1, 2, 3, …, 10. I expected Excel to give a median of 5.5, Q1 of 3, and Q3 of 8. The method from the statistics book and both methods on Wikipedia give these answers, since the median is the average of the middle two numbers. Excel gives:

quartile number, Quartile.Inc, Quartile.Exc
1,               3.25,         2.75 
2,               5.5,          5.5
3,               7.75,         8.25

Neither of these agrees with any of the methods I described above.

The descriptions in the help file for Excel are:

Quartile.Inc – Returns the quartile of a data set, based on percentile values from 0..1, inclusive.

Quartile.Exc – Returns the quartile of the data set, based on percentile values from 0..1, exclusive.

Can anyone help me understand the definition Excel is using?

Best Answer

Typically, a rank $r$ (between $1$ and $n$ for $n$ data) is converted to a percent $p$ via the formula

$$p = 100\frac{r-\alpha}{n+1-2\alpha}$$

for some predetermined "plotting position" $\alpha$ between $0$ and $1$, inclusive. Solving for $r$ in terms of $p$ gives

$$r = (n+1-2\alpha) (p/100) + \alpha.$$

Excel has historically used $\alpha=1$ for its PERCENTILE and QUARTILE functions. The online documentation for QUARTILE.INC and QUARTILE.EXC is useless, so we have to reverse-engineer what these functions are doing.

For example, with data $(1,2,3,4,5,6,7,8,9,10)$, we have $n=10$ and $p \in \{25, 50, 75\}$ for the three quartiles. Using $\alpha=1$ in the preceding formula yields ranks of $9(0.25)+1 = 3.25$, $9(0.50)+1 = 5.5$, and $9(0.75)+1 = 7.75$, reproducing the results for QUARTILE.INC.

If instead we set $\alpha=0$ the corresponding ranks are $11(0.25) = 2.75$, $11(0.50) = 5.5$, and $11(0.75) = 8.25$, reproducing the results for QUARTILE.EXC.

Further testing on your part (I do not have a recent version of Excel) may establish the validity of my guess that these two versions of the quartile function are determined by these two (extreme) values of $\alpha$.

By the way, fractional ranks are converted into data values by means of linear interpolation. The process is explained and illustrated in my course notes, Percentiles and EDF Plots; look near the bottom of that page. There is also a link to an Excel spreadsheet illustrating the calculations.
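As a rough check of the rank formula combined with linear interpolation, here is a minimal Python sketch. The name `quantile_alpha` is made up for this illustration, and clamping out-of-range ranks to the smallest and largest values is an assumption of the sketch, not something established about Excel's behavior.

```python
import math

def quantile_alpha(data, p, alpha):
    """Value at percent p (0..100) for plotting position alpha (0..1)."""
    xs = sorted(data)
    n = len(xs)
    # Fractional rank on a 1-based scale: r = (n + 1 - 2*alpha)(p/100) + alpha.
    r = (n + 1 - 2 * alpha) * (p / 100) + alpha
    # Assumption: ranks outside 1..n are clamped to the extreme values.
    r = min(max(r, 1.0), float(n))
    lo = math.floor(r)        # rank of the data value at or below r
    frac = r - lo             # how far r lies toward the next rank
    if lo >= n:
        return xs[n - 1]
    # Linear interpolation between the values at ranks lo and lo + 1.
    return xs[lo - 1] + frac * (xs[lo] - xs[lo - 1])

data = list(range(1, 11))
print([quantile_alpha(data, p, 1) for p in (25, 50, 75)])  # [3.25, 5.5, 7.75]
print([quantile_alpha(data, p, 0) for p in (25, 50, 75)])  # [2.75, 5.5, 8.25]
```

With $\alpha=1$ the sketch reproduces the QUARTILE.INC column from the question, and with $\alpha=0$ the QUARTILE.EXC column. On the 15-value textbook data set, $\alpha=0$ happens to give ranks of exactly 4 and 12, so it returns 10 and 18, the same as the book's hand method for that particular data.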

If you would like to implement a general percentile function in Excel, here's a VBA macro to do it:

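'
' Converts a percent P (a fraction between 0 and 1), computed using
' plotting position constant A, into a percent appropriate for the
' Excel Percentile() and Quartile() functions.  (The default value of
' A for Excel is 1; most values in use are between 0 and 0.5.)
' N is the number of data values.
'
Public Function PercentileA(P As Double, N As Integer, A As Double) As Double
    ' Invalid arguments: exit without assigning, so the function returns 0.
    If N < 1 Or A < 0# Or A > 1# Or P < 0# Or P > 1# Then
        Exit Function
    End If
    If N < 2 Then
        ' A single data value: every percentile is that value.
        PercentileA = 0.5
    Else
        ' Solve (N - 1) * q + 1 = (N + 1 - 2 * A) * P + A for q.
        ' Note: for A < 1 and P near 0 or 1, q can fall outside 0..1.
        PercentileA = ((N - 2 * A + 1) * P + A - 1) / (N - 1)
    End If
End Function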

It converts a nominal percent (such as 25/100) into the percent that would cause Excel's PERCENTILE function to return the desired value. It is intended for use in cell formulas, as in =PERCENTILE(Data, PercentileA(0.25, Count(Data), 0.5)).
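The macro's formula follows from matching ranks. With $\alpha=1$, Excel's rank for a fraction $q$ is $(N-1)q + 1$; setting this equal to the desired rank $(N+1-2A)P + A$ and solving for $q$ gives

$$q = \frac{(N+1-2A)P + A - 1}{N-1}.$$

A small Python port can be used to check that algebra numerically. The names `percentile_a` and `rank` are hypothetical, and raising an exception on invalid input is a choice made for this sketch (the VBA macro returns 0 instead).

```python
import math

def percentile_a(p, n, a):
    """Python port of the PercentileA macro; p is a fraction in 0..1."""
    if n < 1 or not (0 <= a <= 1) or not (0 <= p <= 1):
        # Sketch-only choice: the VBA macro returns 0 here instead.
        raise ValueError("invalid arguments")
    if n < 2:
        return 0.5
    return ((n - 2 * a + 1) * p + a - 1) / (n - 1)

def rank(p, n, a):
    """Fractional rank for fraction p with plotting position a."""
    return (n + 1 - 2 * a) * p + a

n, a, p = 10, 0.5, 0.25
q = percentile_a(p, n, a)
# The a = 1 rank of the converted percent should match the a = 0.5 rank of p.
print(math.isclose(rank(q, n, 1), rank(p, n, a)))  # True
```

Here the first quartile with $A = 0.5$ and $N = 10$ sits at rank 3, and the converted fraction $q = 2/9$ lands on that same rank under Excel's $\alpha=1$ convention.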