Solved – Problem with Pareto distribution and R

pareto-distribution, r

I am trying to test the following property of the Pareto distribution. Let f(x) be the Pareto density

$$
f(x)=\alpha \frac{x_m^\alpha}{x^{\alpha+1}}
$$

so the CDF is

$$
CDF(x)=\int_{x_m}^{x}\alpha \frac{x_m^\alpha}{t^{\alpha+1}}dt=1-\frac{x_m^\alpha}{x^\alpha}
$$

then the probability that $X>x$ is

$$
P(X>x)=1-CDF(x)=\frac{x_m^\alpha}{x^\alpha}
$$

and so we have

$$
\frac{P(X>x)}{f(x)}=\frac{x_m^\alpha/x^\alpha}{\alpha x_m^\alpha/x^{\alpha+1}}=\frac{x}{\alpha}
$$

Now I am trying to test it with R.

 library(PtProcess)
 dd<-rpareto(10000,1.5,0.01)
 cdf<-ecdf(dd)
 df<-density(dd)
 ff<-(1-cdf(df$x))/df$y

If I plot ff

 plot(df$x,ff)

I do not obtain the expected straight line. I guess this is due to the way density() and ecdf() work. I need this form of the test (an a posteriori evaluation of the density and the CDF) so that I can run the same test on a sample of data of unknown origin. I guess I need a way of binning the ecdf() function, in the same way that hist() is the binned version of density().

So my question is:

  • Is there a binned equivalent of ecdf(), in the way that hist() is the binned counterpart of density()?
  • Or can I emulate ecdf() with hist()?

Best Answer

By using ecdf and density, you are not doing the Pareto calculations. You are using sample-based estimates that, being non-parametric, are not guaranteed to (read: are not going to) have the desired property.
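To make this concrete, here is a rough sketch of what density() does with such a sample under its default settings. It assumes the same shape and minimum as in the question and draws the sample by inverse-CDF sampling in base R, so it does not depend on PtProcess. The seed and the variable names are arbitrary choices for the illustration.

```r
set.seed(1)                            # arbitrary seed, for reproducibility only
alpha <- 1.5; x_m <- 0.01              # same shape and minimum as in the question
dd <- x_m * runif(10000)^(-1 / alpha)  # inverse-CDF Pareto sample (base R)

df <- density(dd)                      # defaults: Gaussian kernel, 512 grid points

# The evaluation grid starts below x_m, where the true density is zero.
below_support <- min(df$x) < x_m

# The grid spans the whole heavy tail, so its spacing is wider than the
# kernel bandwidth and the region near x_m is covered by very few points.
grid_step <- diff(df$x)[1]
coarse_grid <- grid_step > df$bw

c(below_support = below_support, coarse_grid = coarse_grid)
```

Both effects distort the ratio (1 - cdf)/density near the lower end, which is where most of the data sit.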

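Evaluate the theoretical density and CDF directly instead:

library(PtProcess)            # provides dpareto() and ppareto()
x <- seq(0.1, 10, by = 0.1)   # grid above the minimum value 0.05
fx <- dpareto(x, 1.5, 0.05)   # theoretical density (shape 1.5, minimum 0.05)
Fx <- ppareto(x, 1.5, 0.05)   # theoretical CDF
plot((1 - Fx) / fx ~ x)       # should be linear in x with slope 1/1.5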

You'll get a nice straight line out.

[Figure: plot of (1 - Fx)/fx against x]
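For a sample of unknown origin, where dpareto() and ppareto() cannot be used, one possible approach is to use hist() with log-spaced bins for the density and ecdf() evaluated at the bin centres for the survival function. This is only a sketch under assumptions: the data are simulated here by inverse-CDF sampling so the result can be compared with x/alpha, and the number of bins (40) and the count threshold (200) are arbitrary choices, not recommendations.

```r
set.seed(1)                            # arbitrary seed, for reproducibility only
alpha <- 1.5; x_m <- 0.01              # illustrative values, as in the question
n <- 10000
dd <- x_m * runif(n)^(-1 / alpha)      # inverse-CDF Pareto sample (base R)

# Log-spaced bin edges, padded slightly so every observation falls in a bin.
edges <- exp(seq(log(min(dd) * 0.999), log(max(dd) * 1.001), length.out = 41))
h <- hist(dd, breaks = edges, plot = FALSE)

mids  <- sqrt(head(edges, -1) * tail(edges, -1))  # geometric bin centres
surv  <- 1 - ecdf(dd)(mids)                       # empirical P(X > x)
ratio <- surv / h$density                         # estimate of x / alpha

# Keep well-populated bins only; sparse tail bins are dominated by noise.
ok <- h$counts >= 200
plot(mids[ok], ratio[ok], log = "xy", xlab = "x", ylab = "P(X > x) / f(x)")
lines(mids[ok], mids[ok] / alpha)                 # theoretical line x / alpha
```

The points should scatter around the line rather than lie exactly on it, since both the numerator and the denominator are still estimates.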
