Solved – Physical significance of heavy tail distributions

cumulative distribution function, distributions, heavy-tailed

I have come across the term "heavy-tailed distribution" several times, but I have been unable to find good resources online that answer a few of my questions:

  1. Compared to what is the tail of the distribution said to be "heavy"?
  2. Given some data and its CCDF plot (a log-log plot, say), how do we decide that the tail is heavy? Simply by curve-fitting a power-law CCDF?
  3. What is the physical significance (or effect) of the tail being heavy? For example, if inter-arrival times follow a heavy-tailed distribution, what are the implications?

Best Answer

Wikipedia is often a reasonable starting point for basic definitions. In this case, it has an entry on heavy-tailed distributions.

The distribution of a random variable $X$ with distribution function $F$ is said to have a heavy right tail if $$\lim_{x\rightarrow\infty} e^{\lambda x}P[X>x]=\infty$$

for all $\lambda>0$, where $P[X>x]=1-F(x)$. This can be interpreted as: the tail decays more slowly than any exponential, which has implications for the existence of moments (see the same Wikipedia entry).
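As a rough numerical illustration of this definition (a sketch only, not a proof), one can evaluate $e^{\lambda x}P[X>x]$ at a few increasing values of $x$. The choices below are assumptions made for illustration: $\lambda = 0.5$, a $t$ distribution with 3 degrees of freedom as the heavy-tailed case, and an exponential distribution with rate 1 as the light-tailed case.

```r
# Evaluate exp(lambda * x) * P[X > x] on a grid of increasing x.
# lambda and the two distributions are illustrative choices only.
lambda <- 0.5
x <- c(10, 50, 100, 200)

# t with 3 degrees of freedom: the tail decays polynomially,
# so the product keeps growing
heavy <- exp(lambda * x) * pt(x, df = 3, lower.tail = FALSE)

# exponential with rate 1: P[X > x] = exp(-x),
# so the product equals exp((lambda - 1) * x) and shrinks towards 0
light <- exp(lambda * x) * pexp(x, rate = 1, lower.tail = FALSE)

print(data.frame(x = x, heavy = heavy, light = light))
```

For the exponential distribution the product would diverge for $\lambda > 1$, but the definition requires divergence for every $\lambda > 0$, which is why a single small $\lambda$ is enough to show that it is not heavy-tailed.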

Then,

  • (1). Compared to the exponential distribution, or more generally to exponential-type tail behaviour.

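  • (2). This can be checked empirically using, for example, a normal QQ-plot:

    # simulate 1000 draws from a t distribution with 3 degrees of freedom
    x <- rt(1000, 3)
    # normal QQ-plot with a red reference line
    qqnorm(x)
    qqline(x, col = "red")

In the resulting plot, the points depart from the reference line in both tails, which indicates tails heavier than those of a normal distribution.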

  • (3). An immediate implication of using a heavy-tailed distribution is that you observe more extreme observations, or equivalently that the model can capture this sort of behaviour, that is, values far beyond the shoulders of the distribution. This is reflected in the summary statistics; for example, compare the statistics of a normal sample with those of a $t$ sample with $3$ degrees of freedom:

    summary(rnorm(1000))

    summary(rt(1000,3))
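The same comparison can be made without simulation noise by computing exact two-sided tail probabilities. The sketch below is illustrative only: the thresholds `k` are arbitrary, and both distributions are compared on their raw scale (the $t$ distribution with 3 degrees of freedom has variance 3, not 1).

```r
# Exact two-sided tail probabilities P[|X| > k] for a few thresholds.
# The thresholds are arbitrary illustrative values.
k <- 2:6

p_norm <- 2 * pnorm(k, lower.tail = FALSE)      # standard normal
p_t3   <- 2 * pt(k, df = 3, lower.tail = FALSE) # t with 3 degrees of freedom

# how many times more likely an exceedance is under the t distribution
ratio <- p_t3 / p_norm

print(data.frame(k = k, p_norm = p_norm, p_t3 = p_t3, ratio = ratio))
```

The ratio grows with the threshold, which is the practical meaning of a heavy tail: the further out you look, the more the heavy-tailed model outweighs the light-tailed one.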
