Terminology in Quantiles – Are There 99 or 100 Percentiles and Their Definitions

quantilesterminology

Are there 99 percentiles, or 100 percentiles? And are they groups of numbers, or divider lines, or pointers to individual numbers?

I suppose the same question would apply for quartiles or any quantile.

I have read that the index of a number at a particular percentile(p), given n items, is i = (p / 100) * n

That suggests to me that there are 100 percentiles.. because supposing you have 100 numbers(i=1 to i=100), then each would have an index(1 to 100).

If you had 200 numbers, there'd be 100 percentiles, but would each refer to a group of two numbers. Or 100 dividers excluding either the far left or far right divider 'cos otherwise you'd get 101 dividers. Or pointers to individual numbers so the first percentile would refer to the second number, (1/100)*200=2 And the hundredth percentile would refer to the 200th number (100/100)*200=200

I have sometimes heard of there being 99 percentiles though..

Google shows the oxford dictionary that says of percentile- "each of the 100 equal groups into which a population can be divided according to the distribution of values of a particular variable." and "each of the 99 intermediate values of a random variable which divide a frequency distribution into 100 such groups."

Wikipedia says "the 20th percentile is the value below which 20% of the observations may be found" But does it actually mean "the value below or equal to which, 20% of the observations may be found" i.e. "the value for which 20% of the values are <= to it". If it were just < and not <=, then By that reasoning, the 100th percentile would be the value below which 100% of the values may be found. I have heard that as an argument that there can be no 100th percentile, because you can't have a number where there are 100% of the numbers below it. But I think maybe that argument that you can't have a 100th percentile is incorrect and is based an error that the definition of a percentile involves <= not <. (or >= not >). So the hundredth percentile would be the final number and would be >= 100% of the numbers.

Best Answer

Both of these senses of percentile, quartile, and so on are in widespread use. It’s easiest to illustrate the difference with quartiles:

  1. the “divider” sense — there are 3 quartiles, which are the values dividing the distribution (or sample) into 4 equal parts:

       1   2   3
    ---|---|---|---
    

    (Sometimes this is used with max and min values included, so there are 5 quartiles numbered 0–4; note this doesn’t conflict with the numbering above, it just extends it.)

  2. the “bin” sense: there are 4 quartiles, the subsets into which those 3 values divide the distribution (or sample)

     1   2   3   4
    ---|---|---|---
    

Neither usage can reasonably be called “wrong”: both are used by many experienced practitioners, and both appear in plenty of authoritative sources (textbooks, technical dictionaries, and the like).

With quartiles, the sense being used is usually clear from context: speaking of a value in the third quartile can only be the “bin” sense, while speaking of all values below the third quartile most likely means the “divider” sense. With percentiles, the distinction is more often unclear, but it’s also not so significant for most purposes, since 1% of a distribution is so small — a narrow strip is approximately a line. Speaking of everyone above the 80th percentile might mean the top 20% or the top 19%, but in an informal context that’s not a major difference, and in rigorous work, the meaning needed should be presumably clarified by the rest of the context.

(Parts of this answer are adapted from https://math.stackexchange.com/questions/1419609/are-there-3-or-4-quartiles-99-or-100-percentiles, which also gives quotations + references.)

Related Question