I am interested in the definition of quartile that is usually used when you're in basic statistics. I have a Stat 101 type book and it just gives an intuitive definition. "About one quarter of the data falls on or below the first quartile…" But, it gives an example where it calculates Q1, Q2, and Q3 for the set of data
5, 7, 9, 10, 11, 13, 14, 15, 16, 17, 18, 18, 20, 21, 37
Since there are 15 pieces of data, it chooses 15 as the median, Q2. It then splits up the remaining data into two halves, 5 through 14, and 16 through 37. These each contain 7 pieces of data and they find the median of each of these sets, 10 and 18, as Q1 and Q3, respectively. This is how I would calculate it myself.
I looked at Wikipedia's article and it gives 2 methods. One agrees with the above, and one says you could also include the median 15 in both sets (but you wouldn't include the median if it was the average of the two middle numbers in the case of an even number of data points). This all makes sense to me.
But, then I checked Excel to see how Excel calculates it. I am using Excel 2010, which has 3 different functions. Quartile was available in 2007 and previous versions. It seems they want you to stop using this in 2010 but it's still available. Quartile.Inc is new but agrees exactly with Quartile as far as I can tell. And, there is Quartile.Exc as well. Both of the last 2 are new in 2010 I believe. This time, I just tried using the integers 1, 2, 3, …, 10. I'm expecting Excel to give median of 5.5, Q1 of 3, and Q3 of 8. The method from the statistics book, as well as both methods on Wikipedia would give these answers, since the median is the average of the middle two numbers. Excel gives
quartile number, Quartile.Inc, Quartile.Exc 1, 3.25, 2.75 2, 5.5, 5.5 3, 7.75, 8.25
Neither of these agree with what I have previously talked about.
The descriptions in the help file for Excel are:
Quartile.Inc – Returns the quartile of a data set, based on percentile values from 0..1, inclusive.
Quartile.Exc – Returns the quartile of the data set, based on percentile values from 0..1, exclusive.
Can any one help me understand this definition Excel is using?