This is a fairly straightforward problem. Although there is a connection between the Poisson and negative binomial distributions, I actually think it is unhelpful for your specific question, as it encourages people to think in terms of negative binomial processes. Basically, you have a series of Poisson processes:
$$Y_i(t_i)|\lambda_i\sim Poisson(\lambda_i t_i)$$
where $Y_i$ is the process, $t_i$ is the time you observe it, and $i$ indexes the individuals. And you are saying that these processes are "similar" by tying the rates together with a distribution:
$$\lambda_i\sim Gamma(\alpha,\beta)$$
On doing the integration/mixing over $\lambda_i$, you have:
$$Y_i(t_i)|\alpha,\beta\sim NegBin(\alpha,p_i)\;\;\; \text{where} \;\;p_i=\frac{t_i}{t_i+\beta}$$
This has a pmf of:
$$Pr(Y_i(t_i)=y_i|\alpha,\beta) = \frac{\Gamma(\alpha+y_i)}{\Gamma(\alpha)y_i!}p_i^{y_i}(1-p_i)^\alpha$$
To get the waiting time distribution we note that:
$$Pr(T_i\leq t_i|\alpha,\beta)=1-Pr(T_i> t_i|\alpha,\beta)=1-Pr(Y_i(t_i)=0|\alpha,\beta)$$
$$=1-(1-p_i)^\alpha=1-\left(1+\frac{t_i}{\beta}\right)^{-\alpha}$$
Differentiate this and you have the PDF:
$$p_{T_i}(t_i|\alpha,\beta)=\frac{\alpha}{\beta}\left(1+\frac{t_i}{\beta}\right)^{-(\alpha+1)}$$
This is the Lomax distribution, a member of the generalized Pareto family (Pareto type II). I would use this as your waiting time distribution.
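To see this numerically, here is a minimal sketch (my own illustration, with made-up parameter values): draw each $\lambda_i$ from the gamma, draw the first-event waiting time as an exponential given $\lambda_i$, and check the empirical CDF against $1-(1+t/\beta)^{-\alpha}$.

```python
# Monte Carlo check that the Gamma-mixed waiting time follows the
# Lomax (Pareto type II) CDF derived above. alpha, beta are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, beta = 2.0, 3.0      # Gamma shape and rate for the lambda_i
n = 100_000

# lambda_i ~ Gamma(alpha, beta); numpy uses a scale (= 1/rate) parameter
lam = rng.gamma(shape=alpha, scale=1.0 / beta, size=n)
# first-event time given lambda_i is Exponential with rate lambda_i
t = rng.exponential(scale=1.0 / lam)

# empirical CDF vs the mixed (Lomax) CDF at a few points
for x in (0.5, 1.0, 2.0, 5.0):
    emp = np.mean(t <= x)
    theo = 1.0 - (1.0 + x / beta) ** (-alpha)
    print(f"t={x}: empirical {emp:.3f}  vs  Lomax CDF {theo:.3f}")

# scipy's lomax matches this density with c=alpha, scale=beta
assert np.isclose(stats.lomax.cdf(1.0, c=alpha, scale=beta),
                  1.0 - (1.0 + 1.0 / beta) ** (-alpha))
```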
To see the connection with the Poisson distribution, note that $\frac{\alpha}{\beta}=E(\lambda_i|\alpha,\beta)$, so that if we set $\beta=\frac{\alpha}{\lambda}$ and then take the limit $\alpha\to\infty$ we get:
$$\lim_{\alpha\to\infty}\frac{\alpha}{\beta}\left(1+\frac{t_i}{\beta}\right)^{-(\alpha+1)}=\lim_{\alpha\to\infty}\lambda\left(1+\frac{\lambda t_i}{\alpha}\right)^{-(\alpha+1)}=\lambda\exp(-\lambda t_i)$$
This means that you can interpret $\frac{1}{\alpha}$ as an over-dispersion parameter.
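A quick numeric check of that limit (my own sketch, with illustrative values of $\lambda$ and $t$): with $\beta=\alpha/\lambda$, the Lomax density should approach $\lambda e^{-\lambda t}$ as $\alpha$ grows, i.e. as the over-dispersion $1/\alpha$ shrinks.

```python
# As alpha -> infinity with beta = alpha/lambda, the Lomax density
# converges to the exponential density lambda * exp(-lambda * t).
import numpy as np

lam, t = 1.5, 2.0  # illustrative rate and time point

def lomax_pdf(t, alpha, beta):
    return (alpha / beta) * (1.0 + t / beta) ** -(alpha + 1)

target = lam * np.exp(-lam * t)  # exponential density at t
for alpha in (1.0, 10.0, 100.0, 10_000.0):
    beta = alpha / lam
    print(f"alpha={alpha:>8}: lomax={lomax_pdf(t, alpha, beta):.5f}  "
          f"exponential={target:.5f}")
```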
Several stochastic processes lead to marginal counts having a Negative
Binomial (NB) distribution and can therefore be called NB processes.
Among them, the NB Lévy process is of special interest since
increments (counts) over non-overlapping time intervals are
independent, a property shared with the Poisson, gamma, and
Wiener processes. The count $N_t$ on an interval of
length $t$ has the NB distribution
$$
N_t \sim \textrm{NB}(r,\,p), \quad r = \gamma t
$$
so the process depends on the two parameters $\gamma >0$ (with the
dimension of an inverse time) and the probability $p$ ($0 < p < 1$).
The expectation is proportional to the interval length, and so is its
variance
$$
\mathbb{E}(N_t) = \gamma t \, (1-p)/p \qquad
\textrm{Var}(N_t) = \gamma t \, (1-p)/p^2.
$$
The variance is greater than the mean (overdispersion), and the
index of dispersion $\textrm{Var}(N_t)/\mathbb{E}(N_t) = 1/p$ does not
depend on $t$. When $p$ is close to $1$ and $\gamma (1-p)$
is close to $\lambda >0$, the process behaves like
a Poisson Process with rate $\lambda$.
An explanation for overdispersion is that several events can happen at
the same time, so a small interval can contain more than one event.
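The moment formulas above are easy to check by simulation. This is my own sketch (not from the article), with illustrative values for $\gamma$, $p$ and $t$:

```python
# Simulate NB Levy process counts on intervals of length t: each
# increment is NB(r, p) with r = gamma * t, and increments over
# non-overlapping intervals are independent.
import numpy as np

rng = np.random.default_rng(1)
gamma_, p, t = 2.0, 0.4, 3.0   # illustrative process parameters
r = gamma_ * t
counts = rng.negative_binomial(r, p, size=200_000)

print("mean:", counts.mean(), " theory:", r * (1 - p) / p)
print("var: ", counts.var(),  " theory:", r * (1 - p) / p**2)
# the index of dispersion 1/p does not depend on t
print("dispersion index:", counts.var() / counts.mean(), " theory:", 1 / p)
```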
It is easy to fit such a process by maximum likelihood even when the
intervals have different lengths. In that case we face an NB regression
with a link function differing from the default link in NB GLMs, so a
custom likelihood maximisation is needed.
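A minimal sketch of such a fit (my own illustration on simulated data, not the article's code): maximise the NB log-likelihood directly, with the size parameter tied to the interval length through $r_i = \gamma t_i$, since that link is not available off the shelf in standard NB GLM software.

```python
# Direct ML fit of (gamma, p) for counts y_i observed on intervals of
# different lengths t_i, where y_i ~ NB(gamma * t_i, p).
import numpy as np
from scipy.special import gammaln
from scipy.optimize import minimize

rng = np.random.default_rng(2)
gamma_true, p_true = 2.0, 0.4
t = rng.uniform(0.5, 5.0, size=500)                 # interval lengths
y = rng.negative_binomial(gamma_true * t, p_true)   # simulated counts

def neg_loglik(theta):
    # unconstrained parameterisation: gamma = exp(.), p = logistic(.)
    g = np.exp(theta[0])
    p = 1.0 / (1.0 + np.exp(-theta[1]))
    r = g * t
    ll = (gammaln(r + y) - gammaln(r) - gammaln(y + 1)
          + r * np.log(p) + y * np.log1p(-p))
    return -ll.sum()

res = minimize(neg_loglik, x0=[0.0, 0.0], method="Nelder-Mead")
g_hat = np.exp(res.x[0])
p_hat = 1.0 / (1.0 + np.exp(-res.x[1]))
print(f"gamma_hat={g_hat:.2f}  p_hat={p_hat:.2f}")
```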
The article by T.J. Kozubowski and K. Podgorski provides theoretical
results as well as an illustration.
Curiously enough, this process does not seem to be frequently used
as such by statisticians.
Best Answer
Let's see what Dan Ma actually says in his blog. To quote:
Aside from some small oddness in the wording, the gist of that is fine. The parameter of the Poisson ($\theta$ in the quoted discussion) represents the underlying rate of claims per unit time; that individuals are heterogeneous, having different 'riskiness' (different claim rates), isn't controversial.
So why does he think that the distribution of the claim-rate is distributed as gamma?
Well, actually he doesn't say that he thinks that at all.
What he says is:
He's positing a circumstance -- discussing an assumption if you wish -- for which he then discusses the consequences.
He doesn't even assert anything about the plausibility of the assumption.
Here are some things that might be reasonable to assert or suppose about the claim-rate distribution:
1) It's necessarily non-negative and may be taken to be continuous.
2) We could expect it to tend to be right-skewed.
3) We might not unreasonably expect there to be a typical level (a mode) around which the bulk of the distribution lies, with the density tailing off as we move further away (i.e. it may be reasonable to expect it to be unimodal, at least to a first approximation).
That's about all we could say without collecting data.
The gamma at least doesn't break any of those suppositions/expectations, and so is likely to result in a more useful distribution than assuming homogeneity of claim-rate, but any number of other distributions satisfy those conditions.
So why gamma rather than, say, lognormal? Likely a matter of convenience: the gamma works nicely with the Poisson. The Poisson itself, even conditional on the individual's underlying claim rate, is another assumption that isn't actually true (though we can make some argument that the assumptions of a Poisson claims process may not be too badly wrong, it's clear that they can't be exactly true).
There's no good reason to think it is gamma-distributed.
Indeed, I'll assert here and now that there's no real-world case where the claim rate is actually gamma distributed, in practice there will always be differences between the actual distribution of interest and some simple model for it; but that's true of essentially all our probability models.
They're convenient fictions, which may sometimes be not so badly inaccurate as to have some value.
Nothing will tell you it is; in fact you can be quite sure - even when it looks like an excellent description of the distribution - that the gamma is at best merely an approximation. You can use diagnostic displays (perhaps something like a Q-Q plot) to help check that it's not too far from gamma.
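One way to run that kind of diagnostic without plotting software is to compare sample quantiles against fitted-gamma quantiles directly. This is a sketch on simulated stand-in data (`rates` is hypothetical, not real claim rates):

```python
# Gamma Q-Q check: fit a gamma to the estimated rates, then compare
# sample quantiles against the fitted distribution's quantiles. Points
# close to the diagonal suggest the gamma is a workable approximation.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
rates = rng.gamma(shape=2.0, scale=0.5, size=1000)  # hypothetical rates

# fit a gamma with location fixed at 0 (rates are non-negative)
shape, loc, scale = stats.gamma.fit(rates, floc=0)
probs = np.linspace(0.01, 0.99, 25)
sample_q = np.quantile(rates, probs)
theo_q = stats.gamma.ppf(probs, shape, loc=0, scale=scale)

# first few (theoretical, sample) quantile pairs
print(np.round(np.c_[theo_q[:3], sample_q[:3]], 3))
```

Systematic curvature away from the diagonal (especially in the tails) would be the signal that the gamma is too crude an approximation.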