[Math] How to plot multinomial beta from Dirichlet Distribution

math-softwareMATLABoctaveprobability distributions

I would like to plot the multinomial beta from the Dirichlet Distribution wiki page.

$$\mathrm{B}(\alpha) = \frac{\prod_{i=1}^K \Gamma(\alpha_i)}{\Gamma\bigl(\sum_{i=1}^K \alpha_i\bigr)},\qquad\alpha=(\alpha_1,\dots,\alpha_K)$$

Looking at the equation, the only input should be a vector of alphas of size $K$.

What Octave/Matlab function(s) should I use?
I found the following Matlab beta function, but it seems to be only for $K=2$.

Best Answer

You can use the gamma and gammaln in MATLAB to directly compute the expression you have. I would recommend gammaln since you will have very big numbers, and the logarithmic form avoids computing the ratio of big numbers.

Also, if you want to sample from Dirichlet, you can generate bunch of gamma random variables using gamrnd (in statistics toolbox) and then normalize.

Related Solutions

[Math] How to derive the Dirichlet-multinomial

There is confusion in the literature. There are at least two takes.

Categorical distribution

Often, the Dirichlet-multinomial is actually not a compound Dirichlet and a multinomial, but a compound Dirichlet and categorical distribution:

$$p(z|\theta) = \prod_i \theta_i^{z_i}$$

This means that this is about only one categorical variable, not a set. The notation of above would for dice assign the vector $[1,0,0,0,0,0]$ to the face with one pip, $[0,1,0,0,0,0]$ to the face with two pips, etc. Naturally, this means that $\sum_i z_i = 1$.

This gets rid off the $\frac{n!}{\prod_i z_i!}$ factor and leads to the much shorter:

$$p(z|\theta)p(\theta|\alpha) = \frac{1}{B(\alpha)}\prod_i \theta_i^{z_i+\alpha_i-1}$$

To subsequently derive at the Dirichlet-multinomial, you'll have to integrate over: $$ \int p(z|\theta) p(\theta|\alpha) d\theta = \frac{1}{B(\alpha)} \int \prod_i \theta_i^{z_i+\alpha_i-1} d\theta$$

Now, the Dirichlet didn't come from nowhere... The factor $B(\alpha)$ is a normalization factor:

$$p(\theta|\alpha) = \frac{1}{B(\alpha)} \prod_i \theta_i^{\alpha_i-1} = \frac{\prod_i \theta_i^{\alpha_i-1}}{\int_{\Delta^n} \prod_i \theta_i^{\alpha_i-1} d\theta}$$

with $\int_{\Delta^n}$ corresponding to the condition $\sum_i \theta_i = 1$. (Feel free to improve my notation.)

In other words, the multivariate Beta function is actually this integral directly from the definition:

$$\int_{\Delta^n} \prod_i \theta_i^{\alpha_i-1} d\theta = B(\alpha)$$

And hence the integral:

$$\int \prod_i \theta_i^{z_i+\alpha_i-1} d\theta = B(\alpha+z)$$

Hence: $$ \int p(z|\theta) p(\theta|\alpha) d\theta = \frac{B(\alpha+z)}{B(\alpha)}$$

Or to end up with something similar looking to the Wikipedia definition:

$$ \int p(z|\theta) p(\theta|\alpha) d\theta = \frac{\prod_i \Gamma(\alpha_i+z_i)}{\Gamma(\sum_i (\alpha_i + z_i))} \frac{\Gamma(\sum_i \alpha_i)}{\prod_i \Gamma(\alpha_i)}$$

Collecting terms:

$$ \int p(z|\theta) p(\theta|\alpha) d\theta = \frac{\Gamma(\sum_i \alpha_i)}{\Gamma(\sum_i \alpha_i + \sum_i z_i)} \prod_i \frac{\Gamma(\alpha_i + z_i)}{\Gamma(\alpha_i)}$$

Note, however that we run $i$ here over the entries in our categorical variable $z$ represented as a vector! This is very different from the Wikipedia definition!

Multinomial distribution

In case of an actual multinomial distribution, counts of $z$ - let's write them $n(z)$ - are actually the topic of consideration, not $z$ itself.

$$p(z|\theta) = \frac{(\sum_k n(z_k))!}{\prod_k (n(z_k)!)} \prod_k \theta_k^{n(z_k)}$$

We now run over $k$ unique variables, not over a vectorized categorical variable.

Of course, we can now again multiply with a Dirichlet distribution and the derivation is along the lines as described before. The result:

$$ \int p(z|\theta) p(\theta|\alpha) d\theta = \frac{(\sum_k n(z_k))!}{\prod_k (n(z_k)!)} \frac{\Gamma(\sum_k \alpha_k)}{\Gamma(\sum_k \alpha_k + \sum_k n(z_k))} \prod_k \frac{\Gamma(\alpha_k + n(z_k))}{\Gamma(\alpha_k)}$$

The term $\sum_k n(z_k)$ can be simplified to $N$. This is the actual Dirichlet-multinomial. It's not pretty, but this is what it is.

N categorical distributions

The third option, and this is meant at the Wikipedia page is the distribution of a sequence of categorical variables. Recall that the multinomial assigns probabilities to the number of extract balls (in an experiment getting n balls out of a bag with k ball types). A sequence of categorical variables assigns a probability to a sequence and has a form without the normalization factor:

$$p(z|\theta) = \prod_k \theta_k^{z_k}$$

Here $k$ runs over the categories. We can now follow the derivation as with the single categorical variable.

$$ \int p(z|\theta) p(\theta|\alpha) d\theta = \frac{\Gamma(\sum_k \alpha_k)}{\Gamma(\sum_k \alpha_k + \sum_k n(z_i=k))} \prod_k \frac{\Gamma(\alpha_k + n(z_i=k))}{\Gamma(\alpha_k)}$$

Note for example that we have $\alpha_k$; the parameter $\alpha$ is now considered the same for each cluster $k$.

[Math] Integrating Dirichlet Distribution

By manipulating the integral you can obtain the beta function. For ease of notation, let $1-k = x$, then $I$ can be written as \begin{align} I &= \int_{0}^{x} t^{\alpha-1}(x-t)^{\beta-1}dt \\ &= x^{\beta-1} \int_{0}^{x} t^{\alpha-1} \left(1-\frac{t}{x}\right)^{\beta-1}dt. \end{align} Let $t/x = u$ then $dt = x du$. If $t=x$ then $u=1$, hence the above integral reduces to $$\int_{0}^{1} x^{\alpha-1} u^{\alpha-1} (1-u)^{\beta-1} x du.$$ Therefore, \begin{align} I &= x^{\alpha+\beta-1} \int_{0}^{1} u^{\alpha-1}(1-u)^{\beta-1}du \\ &= x^{\alpha+\beta-1} \beta(\alpha,\beta). \end{align}

Best Answer

Related Solutions

[Math] How to derive the Dirichlet-multinomial

[Math] Integrating Dirichlet Distribution

Related Question