Solved – Approximate variance for 99.5th percentile for normal distribution

normal distribution, order-statistics, quantiles

My question is similar to this one:
Approximate order statistics for normal random variables

I am looking to find a formula for the variability of an arbitrary percentile of a normal distribution. The question cited does not quite solve it:
– that is concerned with min and max only
– I don't understand what n is

In particular, I am looking for the variability of the 99.5th percentile of a sample from a normal distribution.

Please could you point me to a formula (with a reference) or a refereed paper?

 Edits per Glen:

a) How I am collecting the samples seems irrelevant: the standard deviation of the sample mean from a normal distribution is $\sigma / \sqrt{n}$, irrespective of anything else. Shouldn't the variance of a percentile also be independent of the method of collection?

b) A direct derivation might be acceptable – given how quickly/frequently URLs change, I don't know whether it is sensible to cite stats.stackexchange as a source.

Best Answer

I will try to answer without many formulas, because when we talk about percentiles (you can google ORDER STATISTICS) the pdfs become pretty messy. I just want to give you the main concepts.

  1. We are talking about estimators of a quantile, in your case the 99.5th percentile.

  2. Estimators are Random Variables and hence have moments.

  3. You want to find the Variance of the sample 99.5th percentile from a Normal RV

The most rigorous approach, I think, is to evaluate an integral.

Suppose we call $T$ the estimator of the 99.5th percentile from a Normal RV, with pdf $f_T(t)$:

$\sigma^2(T) = {\displaystyle \int_{-\infty}^{+\infty} } \big(t-E(T)\big)^2 f_T(t)\,dt$

where $E(T) = \mu(T) = {\displaystyle \int_{-\infty}^{+\infty} } t\,f_T(t)\,dt$

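As I said before, $f_T(t)$ is pretty messy and you won't be able to find a closed-form expression for the integral. Consequently you are going to have to evaluate the integral numerically. Just to give you an idea, here is the generic pdf of the $k$th ORDER STATISTIC $X_{(k)}$ (the $k$th smallest value) of a sample of size $n$ drawn from a distribution with cdf $F_X$ and pdf $f_X$:

$f_{X_{(k)}}(t) =\frac{n!}{(k-1)!(n-k)!}[F_X(t)]^{k-1}[1-F_X(t)]^{n-k} f_X(t)$

Here $n$ is simply the number of observations in your sample, and the sample 99.5th percentile is essentially the order statistic with $k$ close to $0.995\,n$.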
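If you do want to go the numerical-integration route, here is a minimal sketch of what that could look like in plain Python. It plugs the Normal cdf and pdf into the order-statistic density above and integrates with a simple trapezoidal rule. The function names, the integration range of $\pm 10$ standard deviations, the grid size, and the choice $k = \lceil 0.995\,n \rceil$ for "the 99.5th percentile" are all my own assumptions for illustration (other percentile conventions interpolate between neighbouring order statistics).

```python
import math
from statistics import NormalDist


def order_stat_pdf(t, k, n, dist):
    """Density at t of the k-th smallest of n i.i.d. draws from dist."""
    F = dist.cdf(t)
    f = dist.pdf(t)
    if F <= 0.0 or F >= 1.0 or f <= 0.0:
        return 0.0  # far tails: the density is numerically zero
    # Work in logs so n! / ((k-1)! (n-k)!) cannot overflow for large n.
    log_c = math.lgamma(n + 1) - math.lgamma(k) - math.lgamma(n - k + 1)
    return math.exp(log_c + (k - 1) * math.log(F)
                    + (n - k) * math.log1p(-F) + math.log(f))


def trapezoid(ys, h):
    """Trapezoidal rule on an evenly spaced grid with spacing h."""
    return h * (sum(ys) - 0.5 * (ys[0] + ys[-1]))


def order_stat_mean_var(k, n, mu=0.0, sigma=1.0, half_width=10.0, steps=20000):
    """Mean and variance of the k-th order statistic, by numerical integration."""
    dist = NormalDist(mu, sigma)
    h = 2.0 * half_width * sigma / steps
    ts = [mu - half_width * sigma + i * h for i in range(steps + 1)]
    pdf = [order_stat_pdf(t, k, n, dist) for t in ts]
    total = trapezoid(pdf, h)  # should be close to 1; used to renormalise
    mean = trapezoid([t * w for t, w in zip(ts, pdf)], h) / total
    var = trapezoid([(t - mean) ** 2 * w for t, w in zip(ts, pdf)], h) / total
    return mean, var
```

For a sample of size 1000 you would call something like `order_stat_mean_var(k=math.ceil(0.995 * 1000), n=1000)`. A quick sanity check is the maximum of two standard Normal draws (`k=2, n=2`), whose mean and variance are known exactly to be $1/\sqrt{\pi}$ and $1 - 1/\pi$.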

So what should you do? In your question you talk about approximation.

The easiest way to go about this is the bootstrap (a parametric one here, since we resample from a fitted Normal). If we just want an unsophisticated way to get some results, the steps are simple:

  1. From your ORIGINAL SAMPLE of size $n$ calculate $\hat{\mu}$, the sample mean, and $\hat{\sigma}^2$, the sample variance.

  2. Calculate the 99.5th percentile from the original sample.

  3. Draw as many resamples as you want, each of size $n$, from a Normal distribution with mean $\hat{\mu}$ and variance $\hat{\sigma}^2$.

  4. For each resample calculate the sample 99.5th percentile and store it in a vector.

  5. Calculate the sample variance of this vector.

This is your approximate variance for the 99.5th percentile.
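The five steps above can be sketched in a few lines of standard-library Python. Treat this as an illustration rather than a reference implementation: the function names, the default of 2000 resamples, and the percentile rule (take the $\lceil p\,n \rceil$-th smallest value, with no interpolation) are assumptions of mine, not something fixed by the method.

```python
import math
import random
import statistics


def sample_percentile(xs, p):
    """Order-statistic estimate: the ceil(p*n)-th smallest value (assumed rule)."""
    s = sorted(xs)
    k = min(len(s), max(1, math.ceil(p * len(s))))
    return s[k - 1]


def bootstrap_percentile_variance(sample, p=0.995, n_resamples=2000, seed=None):
    """Parametric bootstrap variance of the sample p-th percentile."""
    rng = random.Random(seed)
    n = len(sample)
    # Step 1: fit the Normal distribution to the original sample.
    mu_hat = statistics.fmean(sample)
    sigma_hat = statistics.stdev(sample)  # square root of the sample variance
    # Step 2: percentile of the original sample.
    original = sample_percentile(sample, p)
    # Steps 3 and 4: resample from N(mu_hat, sigma_hat^2), store each percentile.
    estimates = []
    for _ in range(n_resamples):
        resample = [rng.gauss(mu_hat, sigma_hat) for _ in range(n)]
        estimates.append(sample_percentile(resample, p))
    # Step 5: sample variance of the stored percentiles.
    return original, statistics.variance(estimates)
```

Calling `bootstrap_percentile_variance(my_data, seed=0)` (with `my_data` standing in for your own list of observations) returns the percentile of the original sample together with its approximate variance. More resamples reduce the Monte Carlo noise in that variance. For large $n$ you can cross-check the result against the standard asymptotic approximation for a sample quantile, $p(1-p) / \big(n\, f(x_p)^2\big)$, where $f$ is the fitted Normal pdf and $x_p$ its $p$-quantile.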