Solved – How to generate numbers according to a Soliton distribution

distributionspython

The Soliton distribution is a discrete probability distribution over a set $\{1,\dots, N\}$ with the probability mass function

$$
p(1)=\frac{1}{N},\qquad p(k)=\frac{1}{k(k-1)}\quad\text{for }k\in\{2,\dots, N\}
$$

I'd like to use it as part of an implementation of an LT code, ideally in Python where a uniform random number generator is available.

Best Answer

If we start at $k=2$, the sums telescope, giving $1-1/k$ for the (modified) CDF. Inverting this, and taking care of the special case $k=1$, gives the following algorithm (coded in R, I'm afraid, but you can take it as pseudocode for a Python implementation):

rsoliton <- function(n.values, n=2) {
  x <- runif(n.values)         # Uniform values in [0,1)
  i <- ceiling(1/x)            # Modified soliton distribution
  i[i > n] <- 1                # Convert extreme values to 1
  i
}

As an example of its use (and a test), let's draw $10^5$ values for $N=10$:

n.trials <- 10^5
i <- rsoliton(n.trials, n=10)
freq <- table(i) / n.trials  # Tabulate frequencies
plot(freq, type="h", lwd=6)

Frequency distribution