Connection between exponential distribution and uniform distribution


I have two random variables (RVs). For RV1, I generate random numbers from the uniform distribution $U(0,1)$, keep those above a certain threshold $y$, and define the RV as the distance $x$ between consecutive kept values, i.e. how many draws it took to generate the next number above the threshold.

[Figure: the uniform samples plotted against their index, with the threshold drawn as a red horizontal line]

In the picture, the RV is thus the distance on the x-axis between consecutive blue points above the red line, ignoring the points below the red line.

My second RV, RV2, is simply drawn from the exponential distribution with rate $\lambda$.

I experimented, and when I chose, for example, the threshold $y=0.9$ and a scale of $9$ for the exponential distribution (`scale=9` in NumPy, i.e. $\lambda = 1/9$), the resulting distributions look very similar. My question is: are they connected, and if so, how?

[Figure: overlaid histograms of RV1 and RV2]

Orange is the exponential (RV2, 100000 samples); blue is the thresholded uniform process (RV1, 100000 samples).

In particular, I am simulating some process and would like to generate arrival times that are exponentially distributed. I am wondering whether I can just sample from a uniform distribution at every iteration and, when the value is above a threshold, define the trial to be a success, and otherwise not. I am interested in whether that process would have exponential interarrival times.

Also, maybe the geometric distribution would be better suited, but I am not sure.

Code for reproduction

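```python
import numpy as np
import matplotlib.pyplot as plt

N = 100_000
x = np.arange(N)
y = np.random.uniform(0, 1, N)
treshold = 0.9

# Scatter plot of all draws, with the threshold as a red line
plt.plot(x, y, alpha=0.5, marker=".", linestyle="")
plt.hlines(treshold, 0, N, color="red")
plt.show()

# Indices of the draws above the threshold, and the gaps between them (RV1)
above_treshold = x[y > treshold]
times_until_above_treshold = np.diff(above_treshold)

plt.hist(times_until_above_treshold, alpha=0.4, bins=np.arange(0, 50))

# Exponential samples (RV2); note that `scale` is the mean 1/lambda, not lambda
y2 = np.random.exponential(scale=9, size=len(above_treshold))

plt.hist(y2, alpha=0.4, bins=np.arange(0, 50))

# Geometric samples with success probability p = 1 - treshold
y2 = np.random.geometric(p=0.1, size=len(above_treshold))

plt.hist(y2, alpha=0.4, bins=np.arange(0, 50))

plt.yscale("log")
plt.show()
```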

Best Answer

Well, I solved the question while typing it.

In short, my experiment of sampling numbers and accepting those above a certain threshold produces waiting times that follow the geometric distribution.
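As a rough numerical check of that claim, the following sketch compares the empirical frequencies of the waiting times with the geometric probability mass function $(1-p)^{k-1}p$. The seed, the sample size, and the variable names are illustrative assumptions and are not taken from the question's code.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

threshold = 0.9
p = 1 - threshold               # success probability of a single draw
n_draws = 1_000_000

# One uniform draw per iteration; a "success" is a draw above the threshold.
u = rng.uniform(0.0, 1.0, n_draws)
success_idx = np.flatnonzero(u > threshold)

# Number of draws between consecutive successes (always at least 1).
gaps = np.diff(success_idx)

# Empirical frequencies of gaps 1..10 versus the geometric pmf.
k = np.arange(1, 11)
empirical = np.array([(gaps == kk).mean() for kk in k])
theoretical = (1 - p) ** (k - 1) * p

print("mean gap:", gaps.mean(), "expected 1/p:", 1 / p)
for kk, e, t in zip(k, empirical, theoretical):
    print(f"k={kk:2d}  empirical={e:.4f}  geometric={t:.4f}")
```

With this many draws, the two columns should agree closely, and the mean gap should be close to $1/p = 10$.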

The geometric distribution is the discrete analogue of the exponential distribution. However, I don't feel confident explaining exactly how they are connected, though there are many resources on that.
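One standard way to make the link precise, sketched here under the assumption that $N$ counts the draws up to and including the first success and that $p = 1-y$ is the per-draw success probability, is to compare survival functions:

$$P(N > k) = (1-p)^k = e^{-\lambda k}, \qquad \lambda = -\ln(1-p), \qquad k = 0, 1, 2, \dots$$

An exponential RV $T$ with rate $\lambda$ has $P(T > t) = e^{-\lambda t}$ for $t \ge 0$, so the two survival functions agree at integer times, and $\lceil T \rceil$ is geometric with $p = 1 - e^{-\lambda}$. For the threshold $y = 0.9$ this gives

$$p = 0.1, \qquad \lambda = -\ln(0.9) \approx 0.105, \qquad 1/\lambda \approx 9.49,$$

which is close to, but not exactly, the scale of $9$ used in the question.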

They do indeed look very similar in this picture (100_000 realisations):

[Figure: overlaid histograms of the experiment and the geometric distribution]

Blue is my experiment, Orange is geometric, with $p=1-y$.
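For the simulation goal stated in the question (exponentially distributed interarrival times), one possible alternative to thresholding at every iteration is inverse-transform sampling: if $U \sim U(0,1)$, then $-\ln(1-U)/\lambda$ is exponential with rate $\lambda$. The sketch below is a minimal illustration; the rate of `0.1`, the seed, and the variable names are assumptions chosen for the example, not values from the original post.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, only for reproducibility

rate = 0.1          # assumed arrival rate (events per unit time), illustrative
n_events = 100_000

# Inverse-transform sampling: U ~ U(0,1)  =>  -ln(1 - U) / rate ~ Exp(rate).
# rng.uniform draws from [0, 1), so 1 - U is never zero and the log is finite.
u = rng.uniform(0.0, 1.0, n_events)
interarrival = -np.log1p(-u) / rate

# Arrival times are the running sum of the interarrival times.
arrival_times = np.cumsum(interarrival)

print("mean interarrival:", interarrival.mean(), "expected:", 1 / rate)
```

This yields continuous arrival times directly, whereas the thresholding experiment yields integer waiting times (a geometric distribution) that only approximate the exponential.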
