The gamma distribution essentially tells us the probability of $k$ events happening in a given amount of time, $t$.
It seems to me that there are certain examples of the gamma distribution where it behaves memoryless. For example, the probability of 2 customers entering a store in 3 hours GIVEN that no customer has entered in the first hour is equivalent to the probability of 2 customers entering a store in 2 hours.
Is my example properly demonstrating "memoryless-ness"? What would be an example of it having memory using my store scenario?
Best Answer
A continuous random variable $T$ has the "memoryless" property if $$\Pr[T > t+x \mid T > x] = \Pr[T > t].$$ So for instance, if $T$ is a service time, then given that one has waited for more than $x$ units, the amount of additional time to wait does not depend on how much time one has already waited.
The gamma distribution with a positive integer shape parameter $n$ is also known as the Erlang distribution. It models the total amount of time needed to wait to observe $n$ events when the interarrival times between successive events are independent and identically distributed exponential random variables. The corresponding stochastic process is what we call a (homogeneous) Poisson point process.
That said, when $n = 1$, the service time is obviously memoryless, since it is exponentially distributed.
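As a quick numerical sanity check of the $n = 1$ case, here is a minimal sketch that compares $\Pr[T > t + x \mid T > x]$ with $\Pr[T > t]$ for an exponential waiting time. The rate `0.5` and the values of `x` are arbitrary illustrative choices, not taken from the question, and the function names are made up for this example.

```python
import math

def exp_survival(t, rate=1.0):
    """Pr[T > t] for an exponential waiting time with the given rate."""
    return math.exp(-rate * t)

def conditional_survival(t, x, rate=1.0):
    """Pr[T > t + x | T > x], computed as Pr[T > t + x] / Pr[T > x]."""
    return exp_survival(t + x, rate) / exp_survival(x, rate)

rate = 0.5  # illustrative rate (an assumption, not from the original post)
for x in (0.0, 1.0, 5.0, 20.0):
    lhs = conditional_survival(2.0, x, rate)
    rhs = exp_survival(2.0, rate)
    print(f"x = {x:5.1f}: Pr[T > 2 + x | T > x] = {lhs:.6f}, Pr[T > 2] = {rhs:.6f}")
```

The two columns should agree for every `x`, up to floating-point rounding, which is exactly what the memoryless property says.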
When $n > 1$, however, the service time is not memoryless. To understand why, consider an example where, say, $n = 100$. Then $T$ is the total time it takes to observe $100$ events. If the event rate is low, you would expect to wait quite a long while; but if $x$ is sufficiently large (e.g., larger than the expectation of $T$), then chances are you have already seen many of the $100$ necessary events, and you do not have to wait much longer to see the remaining events; clearly, this is not the same as starting over.
Another way to think of it is that you're in a (very) long line at the grocery store. There are $99$ people ahead of you. Each person takes a random, exponentially distributed amount of time to check out; suppose the mean is $1$ minute, so the rate is $\lambda = 1$ per minute. The total time until you yourself are checked out (the $99$ people ahead of you plus your own checkout) is gamma (Erlang) with shape $n = 100$ and rate $\lambda = 1$. Then if you have waited already $x = 120$ minutes, the probability that you have to wait at least another $t = 10$ minutes is $$\Pr[T > 130 \mid T > 120] \approx 0.0987092,$$ but $$\Pr[T > 10] \approx 1.$$ That's because in the conditional probability case, you've already waited $120$ minutes and are likely to have seen nearly all of the people in front of you get checked out; whereas $\Pr[T > 10]$ is essentially $1$ because in order for $T \le 10$, all $99$ people in front of you, and then you, have to get checked out in under $10$ minutes.
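The two probabilities in the grocery-line example can be reproduced with a short sketch. It relies on the standard fact that an Erlang($n$, $\lambda$) waiting time exceeds $t$ exactly when fewer than $n$ events of a rate-$\lambda$ Poisson process occur in $[0, t]$. The function name `erlang_survival` is made up for this illustration.

```python
import math

def erlang_survival(t, n, rate=1.0):
    """Pr[T > t] for T ~ Erlang(n, rate).

    T > t exactly when fewer than n events of a Poisson process with
    the given rate occur in [0, t], so we sum the Poisson pmf over
    k = 0, ..., n - 1.
    """
    mean = rate * t
    term = math.exp(-mean)   # Poisson pmf at k = 0
    total = term
    for k in range(1, n):
        term *= mean / k     # pmf(k) = pmf(k - 1) * mean / k
        total += term
    return total

n, x, t = 100, 120.0, 10.0   # values from the grocery-line example
conditional = erlang_survival(x + t, n) / erlang_survival(x, n)
unconditional = erlang_survival(t, n)
print(f"Pr[T > 130 | T > 120] = {conditional:.7f}")
print(f"Pr[T > 10]            = {unconditional:.7f}")
```

If the sketch is right, the first number should land close to the $0.0987$ quoted above and the second should be indistinguishable from $1$ at this precision.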
The issue with your example is in this part of your statement: "GIVEN that no customer has entered in the first hour."
The given condition, that no customer has entered in the first hour, is not the same as saying that the waiting time is over 1 hour. In particular, the event that no customers have arrived in the first hour is a proper subset of the event that the waiting time is over 1 hour, because if only one customer has arrived in the first hour, you still haven't met the stopping condition.
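Mathematically, your statement is $$\Pr[T_2 \le 3 \mid X(1) = 0] = \Pr[T_2 \le 2],$$ where $T_n$ represents the total waiting time to observe $n$ customers arriving, and $X(t)$ represents the number of customers arriving up to time $t$. And while your statement is correct, it's not how memorylessness is defined. For $T_2$ to be memoryless, it needs to satisfy $$\Pr[T_2 \le 3 \mid T_2 > 1] = \Pr[T_2 \le 2].$$ That is to say, the probability that the second customer arrives within $3$ hours, given that fewer than two customers have arrived in the first hour, would have to equal the probability that the second customer arrives within $2$ hours.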
This is false for the reason described above: exactly one customer could have arrived within the first hour, in which case only one more customer is needed within the next two hours to meet the stopping condition, whereas the right-hand side is the probability that two customers arrive within two hours, starting from none.
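To see the difference between the two conditioning events numerically, here is a small Monte Carlo sketch. The question does not specify an arrival rate, so the rate of $1$ customer per hour, the number of trials, and the seed are all assumptions chosen for illustration; `simulate` is a hypothetical helper name.

```python
import random

def simulate(num_trials=200_000, rate=1.0, seed=0):
    """Estimate three probabilities for T2, the arrival time of the 2nd customer.

    Returns estimates of
      Pr[T2 <= 2],  Pr[T2 <= 3 | X(1) = 0],  Pr[T2 <= 3 | T2 > 1].
    """
    rng = random.Random(seed)
    n_all = hit_all = 0        # unconditional: Pr[T2 <= 2]
    n_empty = hit_empty = 0    # conditioned on X(1) = 0
    n_wait = hit_wait = 0      # conditioned on T2 > 1
    for _ in range(num_trials):
        t1 = rng.expovariate(rate)        # arrival time of 1st customer
        t2 = t1 + rng.expovariate(rate)   # arrival time of 2nd customer
        n_all += 1
        hit_all += t2 <= 2
        if t1 > 1:             # no arrivals in the first hour, X(1) = 0
            n_empty += 1
            hit_empty += t2 <= 3
        if t2 > 1:             # fewer than two arrivals in the first hour
            n_wait += 1
            hit_wait += t2 <= 3
    return hit_all / n_all, hit_empty / n_empty, hit_wait / n_wait

p_two_hours, p_given_empty, p_given_waiting = simulate()
print(f"Pr[T2 <= 2]            ~ {p_two_hours:.3f}")
print(f"Pr[T2 <= 3 | X(1) = 0] ~ {p_given_empty:.3f}")
print(f"Pr[T2 <= 3 | T2 > 1]   ~ {p_given_waiting:.3f}")
```

Under the assumed rate of $1$ per hour, the closed forms are $\Pr[T_2 \le 2] = 1 - 3e^{-2} \approx 0.594$ and $\Pr[T_2 \le 3 \mid T_2 > 1] = 1 - 2e^{-2} \approx 0.729$, so the first two estimates should roughly agree with each other while the third should be noticeably larger.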