[Math] “Standard error” of a sample’s 90th percentile for a normally distributed population

probability

When sampling from a normally distributed population, I understand that the expected deviation between the sample mean and the population mean can be calculated using the standard error:

$$ \text{standard error} = \frac{\sigma_{\text{population}}}{\sqrt{n}}$$

Is there a way to calculate the expected deviation between a sample's 90th percentile and the population's true 90th percentile?

Edit:
Here's my attempt to formalize this idea:

$\sigma= \frac{\sum_{i=1}^n{(\pi_{90}^*-\pi_{90})^2}}{n}$, where $\pi_{90}$ is the true value, and $\pi_{90}^*$ is the sample value, such that $P(f(X) < \pi_{90}) = 0.9$.

My question is: can $\sigma$ be expressed in the form $\sigma = g(f(X))$, where $g$ is some mapping from $f$'s formulation to a description of how $\sigma$ scales with $X$? I realize that there may be different answers for different types of PDFs. I'm curious whether this can be solved for any specific PDF (uniform, Gaussian, or whatever else lends itself well to the mathematics).

Best Answer

Using the form from the websites mentioned in the comments, we take sample size $N=10n+9$ and order-statistic index $k=9n+9$ for the same positive integer $n$, so that $k/(N+1)=0.9$. Also, $f(x)$ is the normal PDF and $F(x)$ is the normal CDF.
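As a rough illustration of this indexing, the following Python sketch (standard library only) draws repeated samples of size $N=10n+9$, takes the $k$-th smallest value with $k=9n+9$ as the sample 90th percentile, and reports how much that value varies between samples. The function names, the standard normal population, the trial count, and the seed are illustrative assumptions, not part of the referenced form.

```python
import random
import statistics


def order_stat_index(n):
    """Return (N, k) for the indexing used here: N = 10n + 9, k = 9n + 9."""
    return 10 * n + 9, 9 * n + 9


def sample_p90(sample, k):
    """Return the k-th smallest value (1-based) of the sample."""
    return sorted(sample)[k - 1]


def simulate_p90_sd(n, trials=2000, seed=0):
    """Monte Carlo estimate of the spread of the sample 90th percentile.

    Assumes a standard normal population (mean 0, standard deviation 1).
    """
    rng = random.Random(seed)
    N, k = order_stat_index(n)
    estimates = [
        sample_p90([rng.gauss(0.0, 1.0) for _ in range(N)], k)
        for _ in range(trials)
    ]
    return statistics.stdev(estimates)


if __name__ == "__main__":
    for n in (1, 10, 100):
        N, k = order_stat_index(n)
        print(f"n={n:4d}  N={N:5d}  k={k:5d}  sd ~ {simulate_p90_sd(n):.4f}")
```

The printed spread shrinks as $n$ grows, which is the behaviour the question is asking to quantify.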

For a sample of size $10n+9$, the 90th sample percentile (the $k$-th order statistic with $k=9n+9$) has PDF:

$$\binom{10n+9}{9n+8,1,n}(F(x))^{9n+8}f(x)(1-F(x))^n=\frac{(10n+9)!}{(9n+8)!\,n!}(F(x))^{9n+8}f(x)(1-F(x))^n$$

If we take the binomial approximation for the factors which are powers of $F(x)$ and $1-F(x)$, namely $(F(x))^{9n+8}\approx 1-(9n+8)(1-F(x))$ and $(1-F(x))^n\approx 1-nF(x)$, we get:

$$\approx \frac{(10n+9)!}{(9n+8)!\,n!}\big(1-(9n+8)(1-F(x))\big)f(x)(1-nF(x))=\frac{(10n+9)!}{(9n+8)!\,n!}\big((9n+8)F(x)-(9n+7)\big)f(x)(1-nF(x))=\frac{(10n+9)!}{(9n+8)!\,n!}\left[-(9n^2+8n)F(x)^2+(9n^2+16n+8)F(x)-(9n+7)\right]f(x)$$

We approximate further by throwing out all terms which are not $\Theta(n^2)$ (ultimately we will be taking the large $n$ limit, so this approximation seems at least plausible):

$$\approx \frac{(10n+9)!}{(9n+8)!\,n!}\,9n^2F(x)(1-F(x))f(x)$$

Approximating even further, we now make use of Stirling's formula:

$$\approx \frac{\sqrt{2\pi(10n+9)}\left(\frac{10n+9}{e}\right)^{10n+9}}{2\pi\sqrt{(9n+8)n}\left(\frac{9n+8}{e}\right)^{9n+8}\left(\frac{n}{e}\right)^n}\,9n^2F(x)(1-F(x))f(x)=\frac{\sqrt{10n+9}\,(10n+9)^{10n+9}}{e\sqrt{2\pi(9n+8)n}\,(9n+8)^{9n+8}\,n^n}\,9n^2F(x)(1-F(x))f(x)$$

Neglecting all parts of factors which are not at least $\Omega(n)$ and attempting to cancel terms of similar orders (this has essentially no analytical justification, but otherwise none of these approximations simplify to anything):

$$\approx \frac{1}{e\sqrt{2\pi}}\,\frac{10}{9}\,F(x)(1-F(x))f(x)$$

which isn't even a probability density function.
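Although the approximations do not collapse to a usable density, the exact density of the $k$-th order statistic (with $F$ raised to $k-1=9n+8$ and $1-F$ raised to $N-k=n$) can be integrated numerically for any particular $n$. The sketch below assumes a standard normal population and uses only the Python standard library; the function names, the integration range $[-8, 8]$, and the midpoint rule with 4000 steps are illustrative choices.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)  # assumed population: mean 0, sd 1


def p90_order_stat_pdf(x, n):
    """Exact density of the k-th order statistic, N = 10n + 9, k = 9n + 9."""
    N, k = 10 * n + 9, 9 * n + 9
    F = STD_NORMAL.cdf(x)
    if F <= 0.0 or F >= 1.0:
        return 0.0  # far tails contribute nothing at double precision
    # log of N! / ((k-1)! (N-k)!), computed with lgamma to avoid overflow
    log_coef = math.lgamma(N + 1) - math.lgamma(k) - math.lgamma(N - k + 1)
    log_pdf = (
        log_coef
        + (k - 1) * math.log(F)
        + (N - k) * math.log1p(-F)
        + math.log(STD_NORMAL.pdf(x))
    )
    return math.exp(log_pdf)


def p90_moments(n, lo=-8.0, hi=8.0, steps=4000):
    """Midpoint-rule estimates of (total mass, mean, standard deviation)."""
    h = (hi - lo) / steps
    total = mean = second = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        w = p90_order_stat_pdf(x, n) * h
        total += w
        mean += x * w
        second += x * x * w
    mean /= total
    variance = second / total - mean * mean
    return total, mean, math.sqrt(variance)


if __name__ == "__main__":
    for n in (1, 10, 100):
        total, mean, sd = p90_moments(n)
        print(f"n={n:4d}  mass={total:.6f}  mean={mean:.4f}  sd={sd:.4f}")
```

The total mass serves as a sanity check that the density is normalized, and the reported standard deviation is a numerical answer for that specific $n$ rather than a closed form. For very large $n$ the density becomes narrow, so the step count may need to be increased.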

Hence, as the comments above point out, while the expected deviation of the sample 90th percentile from the true one does exist in theory, even incredibly aggressive approximations do not readily yield a simple approximate closed form that holds for arbitrarily large sample sizes (i.e. in the large $n$ limit, and thus independent of the actual sample size).
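For a sense of scale at a given finite sample size, one commonly quoted large-sample result for sample quantiles may still be useful. It is an outside result, not derived from the approximations in this answer, and it assumes $f$ is continuous and positive at the true quantile $x_p$. Under those assumptions the sample $p$-quantile is approximately normally distributed around $x_p$ with

$$\text{standard error} \approx \frac{\sqrt{p(1-p)}}{f(x_p)\sqrt{N}}$$

For a normal population and $p=0.9$, we have $x_p \approx \mu + 1.2816\,\sigma_{\text{population}}$ and $f(x_p) \approx 0.1755/\sigma_{\text{population}}$, so

$$\text{standard error} \approx \frac{0.3}{0.1755}\cdot\frac{\sigma_{\text{population}}}{\sqrt{N}} \approx 1.71\,\frac{\sigma_{\text{population}}}{\sqrt{N}}$$

This depends on $N$, so it is a finite-sample approximation rather than a limit that is independent of the sample size.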
