Set the mode and median of a gamma distribution equal to each other

gamma distributiongamma function

I am trying to generate a set of random positive steps that will result in a final location that is close to what I would have gotten from taking a similar number of fixed steps$^1$. I would like the step size to be sampled from gamma distribution$^2$ that has a peak at the fixed step size. If I understand correctly, this means that I want a gamma distribution whose mode is equal to the median.

$$mode = (k − 1) \theta$$
$$median = \theta \gamma^{-1}(k, \frac{\Gamma(k)}{2})$$

Setting these equal to each other and taking $\gamma$ of both sides yields

$$2 \gamma(k, k – 1) = \Gamma(k)$$

Which can be reworked to

$$\gamma(k, k – 1) = \Gamma(k, k – 1)$$

Where $\gamma$ is the lower incomplete gamma, and $\Gamma$ is the upper incomplete gamma function.

There is no closed form solution for $k$ that I am aware of to this problem. Having played with the integrals for a bit, I am unsure how to proceed to find the shape parameter of my distribution that will ensure that the mode and median coincide (getting the scale $\theta$ from the shape is trivial of course).

I am perfectly OK with a numerical solution, but I am not sure how to obtain even that, using commonly defined functions$^3$.

In short, how do I find the shape parameter of a gamma function whose median is equal to its mode?


Related questions, that helped me visualize up to here:


$^1$ I am doing this in Python + numpy, so I want a vector of random values x of size n such that np.cumsum(x) represents a sequence of motonically increasing x-values randomly disturbed from approximately np.linspace(x[0], x[-1], n).

$^2$ Implemented in np.random.gamma.

$^3$ In particular, scipy defines the regularized and non-regularized, complete and incomplete gammas and their inverses.

Best Answer

This is in response to your original modeling problem for steps with gamma distributed incremental distances. (I don't understand why you need the mean and mode of the gamma distributed steps to be the same.)

For example, we can use incremental distances $D_i$ identically and independently distributed as $\mathsf{Gamma}(shape=3, rate=3).$ Then $E(D_i) = 1, Var(D_i) = 1/3.$

Look at 50 such steps in time, starting at position $0$ at stage $0.$ At the $i$th stage we move to the right by $D_i.$ Then total distances after $n$ stages are $X_n = \sum_{i=1}^n D_i.$ Below we use R statistical software to simulate 50 stages.

set.seed(2018)       # use this statement to repeat EXACTLY this same simulation
d = rgamma(50, 3, 3) # incremental distances at each stage
x = c(0, cumsum(d))  # successive positions 
head(x)              # first six positions
## 0.0000000 0.6253108 1.0705457 2.0527608 4.0516030 4.7512606
tail(x)              # last six positions
## 43.76088 44.59100 45.67443 46.93781 47.70384 48.97505

The incremental distances average about 1 unit each. After 50 stages the total distance is $48.98 \approx 50$ units. Also, $E(X_{50}) = 50,\, Var(X_{50}) = 50/3,\, SD(X_{50}) = 4.0825,$ and by the Central Limit Theorem $X_{50} \stackrel{aprx}{\sim} \mathsf{Norm}(50, 4.0825),$ so that roughly 95% of the total distances after 50 stages will be within $50 \pm 8.0.$

Here is a plot with step numbers on the horizontal axis and total distance on the vertical axis. Some of the incremental distances (vertical jumps) are less than 1 and some are greater than 1, but their average is about 1.

plot(x, type="s", xlab="Step")

enter image description here

Related Question