I have a skewed normal distribution for which I know the average, standard deviation, skewness & kurtosis (which is different from zero).
Given a number $X$, how can I estimate which percentile corresponds to that value? (I'm OK with an approximate value of this percentile.)
I used z-score tables in the past (before having skewed distributions), but they seem to apply only to non-skewed distributions.
Thanks for your help.
Best Answer
The Wiki page Skew normal distribution provides the information to estimate the parameters using the sample mean ($\bar{x}$), standard deviation ($s$), and skewness ($\hat{\gamma}$). The 3 parameters to be estimated are $\mu$, $\sigma$, and $\alpha$.
If $|\hat{\gamma}|<1$, then $\hat{\alpha}$ is found in two steps:
$$\delta =\sqrt{\frac{\pi \left| \hat{\gamma} \right| ^{2/3}}{2 \left(\left| \hat{\gamma} \right| ^{2/3}+\left(\frac{4-\pi }{2}\right)^{2/3}\right)}}$$
$$\hat{\alpha} = \text{sgn}(\hat{\gamma})\frac{\delta }{\sqrt{1-\delta ^2}}$$
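Equivalently, $\hat{\alpha}$ is the solution to
$$\hat{\gamma} =\frac{\sqrt{2} (4-\pi ) \hat{\alpha} ^3}{\left((\pi -2) \hat{\alpha} ^2+\pi \right)^{3/2}}$$
which can also be solved numerically. No real solution exists once $|\hat{\gamma}|$ reaches the maximum skewness of a skew normal distribution (about 0.9953), so a larger sample skewness cannot be matched by this method. Then $\hat{\mu}$ and $\hat{\sigma}$ are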
$$\hat{\sigma} =\frac{s}{\sqrt{1-\frac{2 \hat{\alpha} ^2}{\pi \left(\hat{\alpha} ^2+1\right)}}}$$
$$\hat{\mu} =\bar{x}-\frac{\sqrt{\frac{2}{\pi }} \hat{\alpha} \hat{\sigma} }{\sqrt{\hat{\alpha} ^2+1}}$$
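As a rough illustration of these moment-based estimates, here is a minimal Python sketch using only the standard library. The function name `skew_normal_moment_estimates` and the input values in the usage lines are hypothetical, chosen only for illustration. It uses $\hat{\alpha} = \text{sgn}(\hat{\gamma})\,\delta/\sqrt{1-\delta^2}$, the inverse of $\delta = \alpha/\sqrt{1+\alpha^2}$, and rewrites $\hat{\sigma}$ and $\hat{\mu}$ in terms of $\delta$ (since $\hat{\alpha}^2/(\hat{\alpha}^2+1) = \delta^2$). It rejects a skewness at or above the theoretical maximum, where the equations have no real solution.

```python
import math

# Largest |skewness| a skew normal can have (the limit as alpha -> infinity).
MAX_SKEW = math.sqrt(2) * (4 - math.pi) / (math.pi - 2) ** 1.5  # about 0.9953


def skew_normal_moment_estimates(mean, sd, skew):
    """Return (mu, sigma, alpha) matching a sample mean, sd and skewness."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    if abs(skew) >= MAX_SKEW:
        raise ValueError("skewness is outside the range of a skew normal")
    g = abs(skew) ** (2 / 3)
    c = ((4 - math.pi) / 2) ** (2 / 3)
    # Signed delta; its magnitude is the delta of the first step above.
    delta = math.copysign(math.sqrt(math.pi / 2 * g / (g + c)), skew)
    alpha = delta / math.sqrt(1 - delta ** 2)
    # 2*alpha^2 / (pi*(alpha^2 + 1)) equals 2*delta^2 / pi.
    sigma = sd / math.sqrt(1 - 2 * delta ** 2 / math.pi)
    # sqrt(2/pi) * alpha / sqrt(alpha^2 + 1) equals sqrt(2/pi) * delta.
    mu = mean - sigma * delta * math.sqrt(2 / math.pi)
    return mu, sigma, alpha


# Hypothetical inputs, not taken from the question.
mu, sigma, alpha = skew_normal_moment_estimates(mean=6.0, sd=1.5, skew=0.6)
print(mu, sigma, alpha)
```

A quick sanity check is to plug the returned $\hat{\alpha}$ back into the skewness formula and confirm that it reproduces the input skewness.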
Armed with the parameter estimates, one can estimate the cumulative distribution function:
$$Pr(X \le x)=\Phi\left(\frac{x\, -\hat{\mu} }{\hat{\sigma} }\right)-2 T\left(\frac{x\, -\hat{\mu} }{\hat{\sigma} },\hat{\alpha} \right)$$
where $T$ is Owen's T function: $T(x,a)=\frac{1}{2 \pi }\int_0^a \frac{\exp \left(-\frac{1}{2} x^2 \left(1+t^2\right)\right)}{1+t^2} \, dt$.
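For readers without Mathematica or R, the CDF can be sketched in plain Python. This is only an illustrative sketch: the function names, the Simpson's-rule integration of Owen's T (using the standard definition, with $\exp(-\tfrac{1}{2}x^2(1+t^2))/(1+t^2)$ as the integrand) and the default of 2000 subintervals are my own choices, and the parameter values in the usage lines are hypothetical. It should be reasonably accurate for moderate $|\hat{\alpha}|$; a very large $|\hat{\alpha}|$ would need more subintervals or a dedicated Owen's T routine.

```python
import math


def owens_t(h, a, n=2000):
    """Owen's T(h, a) by composite Simpson's rule on [0, a]; n must be even."""
    if a == 0:
        return 0.0

    def integrand(t):
        return math.exp(-0.5 * h * h * (1 + t * t)) / (1 + t * t)

    step = a / n  # negative when a < 0, which gives T(h, -a) = -T(h, a)
    total = integrand(0.0) + integrand(a)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * step)
    return total * step / 3 / (2 * math.pi)


def normal_cdf(z):
    """Standard normal CDF, Phi(z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def skew_normal_cdf(x, mu, sigma, alpha):
    """Pr(X <= x) = Phi(z) - 2 T(z, alpha), with z = (x - mu) / sigma."""
    z = (x - mu) / sigma
    return normal_cdf(z) - 2 * owens_t(z, alpha)


# Hypothetical parameter values, not taken from the question.
percentile = 100 * skew_normal_cdf(7.5, mu=5.0, sigma=2.0, alpha=1.5)
print(percentile)
```

Multiplying the CDF value by 100 gives the percentile of $x$. With $\hat{\alpha}=0$ the Owen's T term vanishes and the result reduces to the ordinary z-score table lookup.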
Here is an implementation using Mathematica:
I know code should be given as text, but because you are unlikely to have Mathematica (and the code would look much messier as text), it is shown as an image here; it should still be instructive as to the process.
To estimate the percentage of the distribution no larger than a specified value, you'll need to use the cumulative distribution function (CDF) described on the Wiki page. Using Mathematica for $X = 7.5$ and $X = 5.2$:
If you have access to the statistical package R, then the `sn` package will perform these calculations.