Solved – Plot Pareto tails in QQ-plot for log-normal distributions

kolmogorov-smirnov test, lognormal distribution, power law, qq-plot, r

I'm working on samples that I'm trying to fit with log-normal distributions. In some cases, the Kolmogorov-Smirnov test statistic is something like D = 0.0056 with an associated p-value of 0. Hence, my sample shows very small departures from a theoretical log-normal distribution, but looking at the p-value I reject the null hypothesis that my sample is drawn from the reference distribution (log-normal).

The KS test is performed with this R code:

sample.z <- std(sample) # I standardize data to allow for comparisons
LN <- rlnorm(1e5, 0, 1) # simulated reference lognormal with meanlog = 0 and sdlog = 1 (parameters on the log scale)
ks.test(sample.z, LN)

Looking at the QQ-plot I see significant departures in the tails of my distributions. Thus, I was thinking that maybe I'm getting these results because of Pareto tails in my approximately log-normal distributions. Indeed,

library(igraph)
power.law.fit(sample)

confirms this hypothesis by identifying a Pareto tail after a certain lower bound (xmin) with a certain alpha.

Now, I would like to show the fit of my Pareto tail in the QQ-plot. How can I do that? Can you suggest other data-visualization methods to highlight the presence of Pareto tails in log-normal distributions?

[Figures: PDFs, CDFs, and fitdist output]

Best Answer

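If you take logs, the data should be normal with an exponential tail: the log of a lognormal variable is normal, and the log of a Pareto variable is a shifted exponential.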
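A short derivation of the tail claim, as a sketch. The symbols $a$ (tail index) and $x_{\min}$ (lower bound) are my notation, with $a$ defined through the survival function. If `alpha` is instead the exponent of the density, which is how I believe igraph's `power.law.fit` reports it, then $a = \alpha - 1$.

$$
P(X > x) = \left(\frac{x_{\min}}{x}\right)^{a}, \quad x \ge x_{\min}
\qquad\Longrightarrow\qquad
P(\log X > y) = P(X > e^{y}) = e^{-a\,(y - \log x_{\min})}, \quad y \ge \log x_{\min}.
$$

So $\log X - \log x_{\min}$ is exponential with rate $a$, which is why the logged tail should look straight on an exponential QQ plot.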

Just do a normal and an exponential QQ plot of the logged data. The first should be roughly linear before the kink, the second roughly linear after the kink:

[Figure: normal QQ plot and exponential QQ plot of the logged data]

(In this case the change point was at 5.5, and we see what we should - a kink near 5.5, and the first plot roughly linear before and the second roughly linear after the kink. The fact that the first plot looks roughly linear after the kink as well suggests that the Pareto data might in this particular example have been reasonably approximated by a second lognormal.)
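Something along these lines would produce the two plots. It is a minimal sketch on simulated data, not the code behind the figure above: the sample size, the lognormal parameters, and the tail index `a` are all made-up values, and only the change point of 5.5 on the log scale is taken from the example.

```r
set.seed(1)

# Hypothetical data: lognormal body with a Pareto tail above xmin.
n    <- 5000
xmin <- exp(5.5)  # change point: 5.5 on the log scale
a    <- 2         # assumed Pareto tail index (survival-function exponent)

x <- rlnorm(n, meanlog = 4, sdlog = 1)
in_tail <- x > xmin
# Replace values above xmin with inverse-CDF draws from Pareto(xmin, a).
x[in_tail] <- xmin * runif(sum(in_tail))^(-1 / a)

y <- log(x)  # logged data: normal body, shifted-exponential tail

op <- par(mfrow = c(1, 2))

# Normal QQ plot: should be roughly linear below the kink.
qqnorm(y, main = "Normal QQ plot of log(x)")
abline(h = log(xmin), lty = 2)

# Exponential QQ plot: should be roughly linear above the kink.
exp_q <- qexp(ppoints(n))
plot(exp_q, sort(y),
     xlab = "Exponential quantiles", ylab = "Sorted log(x)",
     main = "Exponential QQ plot of log(x)")
abline(h = log(xmin), lty = 2)

par(op)
```

The dashed line marks the change point on the log scale. With real data you would replace `x` with your sample and `xmin` with the lower bound reported by the power-law fit. In the exponential QQ plot the slope above the kink should be about `1 / a`.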
