If task start time is chosen from a uniform random distribution, is inter-arrival time exponentially distributed

markov chainsprobabilityprobability distributionsqueueing-theorystatistics

Question: If the start time is generated from uniform random distribution, is the inter arrival time exponentially distributed?

Context:
We have a number of tasks to be run every day (and we have the flexibility to set the actual start time). To avoid overloading the system by running everything at a fixed time, and we want to spread the load through out the day. And everyday we want to run the task at the same time.

To spread the load, we hash the taskid to generate the start time. (Almost equivalent to generating from uniform random distribution)

The question is, since the arrival time is uniformly distributed and independent of each other, does the inter arrival time follow exponential distribution? I want to use queueing theory to model the system performance, since exponentially distributed inter arrival time is the most studied distribution, I want to see if I can use M/M/1 queuing model.

I understand there are some differences. The start time generated is discrete, but the exponential distribution is continuous. I assume, I can increase the granularity of the time to approximate continuous time.


Another option is to generate the start time as a Poisson process (At every t, decide whether to run or not with probability p. ) But at some point, we want to ensure the task runs every day, so we might have to force it to run at midnight eventually. So this might get a bit weird.

Best Answer

If the day is of length $1$, and there are $n$ independent uniformly distributed start times then this is equivalent to breaking a stick into $n+1$ pieces and there is probably a duplicate question on this site.

The inter-arrival times are not independent but they are identically distributed (as is the time before the first arrival and the time after the last, so $n+1$ gaps overall). Jointly they have a uniform Dirichlet distribution.

Each gap actually has a $\text{Beta}(1,n)$ distribution so with CDF $1-(1-x)^n$ and density $n(1-x)^{n-1}$, with mean $\frac1{n+1}$ and variance $\frac{n}{(n+1)^2(n+2)}$. I have seen this called a "reverse power function distribution".

If the day is not length $1$ but $d$ then stretch these results so the CDF of the inter-arrival times is $1-\left(1-\frac xd\right)^{n}$ and density $\frac nd\left(1-\frac xd\right)^{n-1}$ and mean $\frac d{n+1}$ and variance $\frac{n d^2}{(n+1)^2(n+2)}$. For large $n$ this is close to an exponential distribution with rate $\frac{n+1}{d}$, as you might guess.

For example, here are the densities using R when $d=1$ and $n=100$, with the Beta distribution in red and exponential distribution in blue. They are very close to each other, and further right would be very close to $0$ (there is a slight distinction beyond $1$ as the Beta density would be exactly $0$ while the exponential density would be positive but less than $10^{−40}$). For larger n they would be even closer

n <- 100
curve(dbeta(x,1,n), from=0, to=6/n, col="red")
curve(dexp(x, n+1), from=0, to=6/n, add=TRUE, col="blue")  

enter image description here