Looking for a long-tail distribution with mean=1

distributions, fat-tails, mathematical-statistics

I would like to generate random numbers $X$ from a distribution that meets the following requirements:

  1. $X \in [0, \infty) $

  2. The mean of the r.v. is around 1, i.e., $\mathbb{E}[X] \approx 1$

  3. The distribution has a "long tail", in the sense of the typical description: https://en.wikipedia.org/wiki/Long_tail

  4. To be more quantitative, say $P(X > 5) \ge 0.1$

  5. Although it's possible to combine multiple distributions to achieve the above goals, I am looking for a single distribution that can be written as a compact, closed-form probability density function.

In words, I am looking for a distribution whose mode or mean is around 1 and that has a fat tail extending to large values. Can you suggest a distribution that satisfies these properties?

(It's possible that this is naive and no such distribution exists.)

The closest candidate that came to my mind is the $\chi^2$ distribution, which is controlled by the parameter $k$. However, either the mean is too high or the tail probability is too low. Below is an example of $\chi^2(k=3)$. Ideally I would like to move the mean to 1 and make the tail "fatter".

[Figure: probability density of $\chi^2(k=3)$]
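To make the trade-off concrete, here is a minimal sketch that prints the mean and $P(X>5)$ of $\chi^2(k)$ for a few values of $k$. The helper name `chi2_mean_and_tail` is made up for illustration; the only library calls are `scipy.stats.chi2`, `.mean()` and `.sf()`.

```python
import scipy.stats


def chi2_mean_and_tail(k, threshold=5.0):
    """Return (mean, P(X > threshold)) for a chi-squared distribution with k degrees of freedom."""
    dist = scipy.stats.chi2(df=k)
    # sf is the survival function, i.e. 1 - cdf
    return dist.mean(), dist.sf(threshold)


for k in (1, 2, 3):
    mean, tail = chi2_mean_and_tail(k)
    print(f"k={k}: mean={mean:.1f}, P(X>5)={tail:.4f}")
```

Since the mean of $\chi^2(k)$ equals $k$, only $k=1$ has mean 1, and its tail probability at 5 falls short of 0.1; at $k=3$ the tail requirement is met but the mean is 3.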

A use case would be to use this distribution as a random number generator, such that the mean of the generated numbers is around 1 while being able to generate large numbers.


Just wanted to point out that although the log-normal distribution with $\mu = -\sigma^2/2$ suggested by @stans satisfies $\mathbb{E}[X] = 1$, it does not produce enough tail probability.

In fact, to keep the mean at 1, $\mu$ has to shift further to the left as $\sigma$ grows, which eventually squeezes the tail probability $P(X>5)$. A grid search over $\sigma \in [1, 8]$ suggests that the tail probability peaks around $\sigma = 1.79$, where $P(X>5) \approx 0.036$.
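The location of that peak can also be checked by hand. A short derivation, writing $\Phi$ for the standard normal CDF (notation not used elsewhere in this post): with $\mu = -\sigma^2/2$ we have $\ln X \sim N(-\sigma^2/2, \sigma^2)$, so

$$
P(X > 5) = 1 - \Phi\!\left(\frac{\ln 5 + \sigma^2/2}{\sigma}\right) = 1 - \Phi\!\left(\frac{\ln 5}{\sigma} + \frac{\sigma}{2}\right).
$$

The tail probability is largest where the argument $g(\sigma) = \ln 5/\sigma + \sigma/2$ is smallest:

$$
g'(\sigma) = -\frac{\ln 5}{\sigma^2} + \frac{1}{2} = 0 \iff \sigma = \sqrt{2 \ln 5} \approx 1.79, \qquad g\!\left(\sqrt{2 \ln 5}\right) = \sqrt{2 \ln 5},
$$

which gives $\max_\sigma P(X>5) = 1 - \Phi(\sqrt{2 \ln 5}) \approx 0.036$, in agreement with the grid search.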


Python code to construct the log-normal distribution and compute the corresponding $P(X>5)$:

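import numpy as np
import scipy.stats

sigma = 1
mu = -0.5 * sigma**2  # forces E[X] = exp(mu + sigma^2/2) = 1

# scipy parameterizes the log-normal by shape s = sigma and scale = exp(mu)
s = sigma
scale = np.exp(mu)

# survival function: sf(5) = 1 - cdf(5) = P(X > 5)
tail_prob = scipy.stats.lognorm(s=s, scale=scale).sf(5)
print(tail_prob)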
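The grid search over $\sigma$ mentioned above can be sketched along the same lines. This is only an illustration: the helper name `lognormal_tail_prob` and the grid resolution (step 0.001) are my own assumptions, not the exact search that was run.

```python
import numpy as np
import scipy.stats


def lognormal_tail_prob(sigma, threshold=5.0):
    """P(X > threshold) for a log-normal with mu = -sigma^2/2, i.e. E[X] = 1."""
    mu = -0.5 * np.asarray(sigma, dtype=float) ** 2
    # frozen scipy distributions broadcast over array-valued parameters
    return scipy.stats.lognorm(s=sigma, scale=np.exp(mu)).sf(threshold)


# assumed grid: sigma in [1, 8] with step 0.001
sigmas = np.linspace(1.0, 8.0, 7001)
tail_probs = lognormal_tail_prob(sigmas)

best = np.argmax(tail_probs)
print(f"sigma = {sigmas[best]:.3f}, P(X>5) = {tail_probs[best]:.4f}")
```

On this grid the maximum should land near $\sigma \approx 1.79$ with $P(X>5) \approx 0.036$, well below the target of 0.1.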

Best Answer

Log-normal, for the right choice of $\mu$ and $\sigma$. In other words, if $X \sim \mathrm{LN}(\mu, \sigma^2)$, then

$ 1 = \mathbb{E}[X] = \exp\{\mu + \sigma^2/2\} \iff \mu = -\sigma^2/2. $

The parameter $\sigma$ controls the "tail fatness" and can be set arbitrarily high.
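For the random-number-generator use case, a minimal sampling sketch might look like the following. The function name `sample_unit_mean_lognormal`, the choice $\sigma = 1$, the sample size and the seed are all illustrative assumptions; the sampler itself is NumPy's `Generator.lognormal`.

```python
import numpy as np


def sample_unit_mean_lognormal(sigma, size, seed=None):
    """Draw log-normal samples with mu = -sigma^2/2, so that E[X] = 1."""
    rng = np.random.default_rng(seed)
    mu = -0.5 * sigma**2
    # mean and sigma here refer to the underlying normal distribution of ln X
    return rng.lognormal(mean=mu, sigma=sigma, size=size)


samples = sample_unit_mean_lognormal(sigma=1.0, size=1_000_000, seed=0)
print("sample mean:     ", samples.mean())
print("empirical P(X>5):", (samples > 5).mean())
```

Note that with $\mathbb{E}[X] = 1$ the variance is $e^{\sigma^2} - 1$, so for large $\sigma$ the sample mean converges to 1 only slowly.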
