Is it true that for some datasets certain percentiles don't exist?

descriptive-statistics, percentile, proof-verification

Here is my reasoning:

Let's assume the exclusive definition of a percentile here.
Suppose we have the set of numbers {10,20,30,40} and we want to calculate the value of its 90th percentile. Such a value doesn't seem to exist. If we take a number from the interval (30;40], it will be a 75th percentile, because 3 of the 4 values lie below it and 3/4=0.75. We can't take a number that is >40. So the 75th percentile is the closest we can get to the 90th percentile.

Now let's try the inclusive definition of a percentile on the same set of numbers to see whether a 90th percentile exists. In this case any number from the interval [30;40) would be a 75th percentile. Under the inclusive definition the 100th percentile DOES exist, and we get it by choosing 40, although we still can't choose any number >40. So the 100th percentile is the closest we will get to the 90th percentile in this case.

This shows that we can't get a 90th percentile in either case. Does this mean that the set of numbers {10,20,30,40} has NO 90th percentile? And, consequently, that for some datasets certain percentiles don't exist?
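The counting in the two cases above can be checked with a short sketch. The function names `rank_exclusive` and `rank_inclusive` are hypothetical, and the candidate values are restricted to the range of the data, as assumed in the reasoning above.

```python
from fractions import Fraction

def rank_exclusive(data, x):
    """Percentage of data values strictly below x (exclusive counting)."""
    return Fraction(100 * sum(v < x for v in data), len(data))

def rank_inclusive(data, x):
    """Percentage of data values less than or equal to x (inclusive counting)."""
    return Fraction(100 * sum(v <= x for v in data), len(data))

data = [10, 20, 30, 40]
# Candidates: each data point plus one value inside each gap (nothing above 40).
candidates = [10, 15, 20, 25, 30, 35, 40]

exclusive_ranks = sorted({rank_exclusive(data, x) for x in candidates})
inclusive_ranks = sorted({rank_inclusive(data, x) for x in candidates})

print([int(r) for r in exclusive_ranks])  # [0, 25, 50, 75]
print([int(r) for r in inclusive_ranks])  # [25, 50, 75, 100]
```

With four data points the attainable ranks move in steps of 25, so 90 appears in neither list.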

Best Answer

It depends on the definition of the $n$th percentile. You could define it as a value for which exactly $n\%$ of the data is below it. Then, for this dataset, any percentile other than the 25th, 50th, 75th, and 100th is undefined. You could also define it as the smallest value in the list that is greater than or equal to at least $n\%$ of the dataset. Then the 90th percentile of $\{10,20,30,40\}$ is $40$, which is also the 100th percentile and the 76th percentile.
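As a rough illustration, the two definitions could be sketched as follows. The function names are hypothetical, `n` is assumed to be an integer, and the first function reads "below" as "at or below", which is an assumption chosen to match the list 25, 50, 75, 100 given above.

```python
def percentile_exact(data, n):
    """Return a data value with exactly n% of the data at or below it,
    or None if no such value exists (first definition, as assumed here)."""
    for v in sorted(data):
        if 100 * sum(u <= v for u in data) == n * len(data):
            return v
    return None

def percentile_nearest_rank(data, n):
    """Return the smallest data value that is greater than or equal to
    at least n% of the data (second definition), for integer 0 < n <= 100."""
    if not 0 < n <= 100:
        raise ValueError("n must satisfy 0 < n <= 100")
    ordered = sorted(data)
    # Integer ceiling of n * len / 100: how many values must be at or below.
    k = -(-n * len(ordered) // 100)
    return ordered[k - 1]

data = [10, 20, 30, 40]
print(percentile_exact(data, 90))         # None
print(percentile_exact(data, 75))         # 30
print(percentile_nearest_rank(data, 90))  # 40
print(percentile_nearest_rank(data, 76))  # 40
```

Under the second definition every $n$ from 1 to 100 yields a value, at the cost of many different $n$ mapping to the same data point.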

But the real problem here is that percentiles aren't very meaningful for small datasets. Why would you want to talk about the 90th percentile of a 4-element dataset?
