Solved – Reparameterization trick for gamma distribution

autoencoders, variational-bayes

I am reading the work of Kingma and Welling on Variational Auto-Encoders (VAE), and I wonder whether there is any way to generate Gamma-distributed samples via a similar reparameterization.

The idea of reparameterization is quite simple. To generate a sample from a particular distribution, e.g. $z\sim N(\mu,\sigma^2)$, we generate samples from another distribution, e.g. $\epsilon\sim N(0,1)$, and then transform them into genuine samples from the desired distribution via $z=\mu+\sigma\epsilon$. This has some nice consequences in the VAE framework, e.g. it reduces the variance of the gradient estimator (not really important here).
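As a quick sketch of this Gaussian case (variable names here are my own, not from the question), the randomness lives entirely in $\epsilon$, while the transform is deterministic and differentiable in $(\mu,\sigma)$:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 2.0, 0.5

# All randomness comes from eps ~ N(0, 1); the map eps -> mu + sigma*eps
# is deterministic and differentiable in (mu, sigma), which is exactly
# what lets gradients flow through the sampling step.
eps = rng.standard_normal(100_000)
z = mu + sigma * eps

# Empirically, z behaves like N(mu, sigma^2)
print(z.mean(), z.std())
```

The point is that `z` has the target $N(\mu,\sigma^2)$ distribution even though the only random draw was from the fixed base distribution $N(0,1)$.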

I basically wonder if such transformation exists for Gamma distribution.

Best Answer

The answer is "yes" in one sense and "no" in another sense.

Suppose $X \sim \operatorname{Gamma}(\alpha,\beta)$, and let $F_{\alpha,\beta}$ denote its cdf and $\Phi$ the standard normal cdf. Define $\epsilon = \Phi^{-1}[F_{\alpha,\beta}(X)]$; then $\epsilon \sim N(0,1)$, and conversely, if you simulate $\epsilon \sim N(0,1)$ you recover the desired gamma distribution by setting $X = F^{-1}_{\alpha,\beta}[\Phi(\epsilon)]$. This is a consequence of the probability integral transform. Additionally, the transform $T(\epsilon ; \alpha, \beta) = F^{-1}_{\alpha,\beta}[\Phi(\epsilon)]$ is differentiable, so you could use this idea with the reparameterization trick, at least in principle, to improve your stochastic variational inference. This means that, in a liberal sense, the answer is "yes, there is a reparameterization trick", and in fact there is one for essentially any family of continuous distributions. If this seems ad hoc, notice that applying the same construction with the Gaussian family in place of the gamma recovers exactly the usual reparameterization trick.
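A minimal sketch of this inverse-cdf construction in Python (my own code, not from the answer), using SciPy's `gamma.ppf` for $F^{-1}_{\alpha,\beta}$ and `norm.cdf` for $\Phi$. Note that SciPy parameterizes the gamma by shape `a` and `scale`, so if $\beta$ is a rate parameter it enters as `scale=1/beta`:

```python
import numpy as np
from scipy.stats import gamma, norm

alpha, beta = 2.0, 3.0  # shape and rate
rng = np.random.default_rng(0)

# Base noise from the fixed distribution N(0, 1)
eps = rng.standard_normal(50_000)

# T(eps; alpha, beta) = F^{-1}_{alpha,beta}(Phi(eps))
# gamma.ppf is the inverse cdf; scale = 1/rate in SciPy's parameterization
x = gamma.ppf(norm.cdf(eps), a=alpha, scale=1.0 / beta)

# Sanity check: the sample mean should be close to alpha / beta
print(x.mean())
```

Since the probability integral transform is exact, `x` is distributed exactly as $\operatorname{Gamma}(\alpha,\beta)$; the practical drawback, as noted below, is that `gamma.ppf` has no closed form and is itself evaluated numerically.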

In a more restrictive sense, I would say the answer is "no". The inverse cdf $F^{-1}_{\alpha,\beta}$ above is not available in closed form, which is inconvenient enough that we might disqualify this approach. Alternatively, there is no reason to restrict ourselves to $\epsilon \sim N(0,1)$: we might just ask for $\epsilon \sim Q$ for some standard distribution $Q$ that is easy to sample from, together with a transform $T$ that is easy to compute and satisfies $T(\epsilon; \alpha, \beta) \sim \operatorname{Gamma}(\alpha,\beta)$.

If you find such a transformation $T$ and standard distribution $Q$, let me know, because I would be interested in it. The main problem is the shape parameter $\alpha$. If $\alpha$ is known, then I can take $T(\epsilon; \alpha, \beta) = \beta\epsilon$ and set $\epsilon \sim \operatorname{Gamma}(\alpha,1)$, because the gamma family with known $\alpha$ is a scale family. The shape parameter does not have any nice algebraic properties as far as I know, aside from the fact that $X_1 + X_2 \sim \operatorname{Gamma}(\alpha_1 + \alpha_2, 1)$ provided that $X_i \sim \operatorname{Gamma}(\alpha_i, 1)$ and they are independent, and it's not clear how to take advantage of this. A negative result for us: if such a convenient $T$ existed, R would probably use it to sample from a generic gamma distribution, but instead it uses rejection sampling.
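The scale-family observation can be sketched in a few lines (my own illustration, treating $\beta$ as a scale parameter, consistent with $T(\epsilon;\alpha,\beta)=\beta\epsilon$ above):

```python
import numpy as np

alpha, beta = 2.0, 3.0  # shape, scale
rng = np.random.default_rng(0)

# With alpha fixed, the base distribution Gamma(alpha, 1) does not depend
# on beta, so eps plays the role of parameter-free noise...
eps = rng.gamma(shape=alpha, scale=1.0, size=50_000)

# ...and the transform T(eps; alpha, beta) = beta * eps is trivially
# differentiable in beta, yielding Gamma(alpha, scale=beta) samples.
x = beta * eps

# Sanity check: mean of Gamma(alpha, scale=beta) is alpha * beta
print(x.mean())
```

This gives a reparameterization gradient with respect to $\beta$ only; the difficulty the answer describes is that no analogous parameter-free base noise is known for $\alpha$.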
