Multinomial Distribution vs. Dirichlet Distribution

dirichlet-distribution multinomial-distribution

Both the Dirichlet and multinomial distributions are distributions over vectors, and both are constrained so that the elements of these vectors sum to a constant value.

Can somebody explain in simple words (and maybe with an example) the detailed differences between the Dirichlet and multinomial distributions?

Does the Dirichlet distribution serve the same purpose as the multinomial distribution?

What are the advantages/disadvantages of using Dirichlet over multinomial distributions?

What makes the Dirichlet distribution different from a multinomial distribution?

Best Answer

The multinomial distribution is a discrete, multivariate distribution for $k$ variables $x_1,x_2,\dots,x_k$ where each $x_i \in \{0,1,\dots,n\}$ and $\sum_{i=1}^k x_i = n$. The Dirichlet distribution is a continuous, multivariate distribution for $k$ variables $x_1,x_2,\dots,x_k$ where each $x_i \in (0,1)$ and $\sum_{i=1}^k x_i = 1$. In the first case, the support of the distribution is limited to a finite number of values, while in the second case, the support contains infinitely many values, with every component falling within the unit interval.
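To make the two supports concrete, here is a minimal sketch that draws one sample from each distribution with NumPy. The parameter values (`n`, `p`, `alpha`) and the seed are arbitrary choices for illustration, not something taken from the question.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

n = 10                            # number of trials (illustrative value)
p = np.array([0.2, 0.3, 0.5])     # category probabilities (illustrative values)
alpha = np.array([1.0, 1.0, 1.0]) # Dirichlet concentration parameters (illustrative)

# Multinomial sample: a vector of k non-negative integer counts that sum to n.
counts = rng.multinomial(n, p)

# Dirichlet sample: a vector of k real numbers in (0, 1) that sum to 1.
probs = rng.dirichlet(alpha)

print(counts, counts.sum())  # integers, total equals n
print(probs, probs.sum())    # reals, total equals 1 up to floating-point rounding
```

The multinomial draw can only land on one of finitely many integer vectors, whereas the Dirichlet draw can land anywhere on the continuous set of vectors with positive components summing to 1.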

Does the Dirichlet distribution serve the same purpose as the multinomial distribution?

No. Multinomial is a distribution for counts, while Dirichlet is usually used as a distribution over probabilities.
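One way to see the different roles is to chain the two: a Dirichlet draw gives a vector of probabilities, and that vector can then parameterize a multinomial that produces counts. The sketch below assumes hypothetical values for `alpha` and `n` and uses NumPy's random generator.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, only for reproducibility

alpha = np.array([2.0, 2.0, 2.0])  # hypothetical concentration parameters
n = 20                             # hypothetical number of trials

# Step 1: the Dirichlet produces a probability vector (continuous, sums to 1).
p = rng.dirichlet(alpha)

# Step 2: the multinomial uses that vector to produce counts (discrete, sum to n).
x = rng.multinomial(n, p)

print("probabilities:", p)
print("counts:       ", x)
```

Here the Dirichlet describes uncertainty about the category probabilities, while the multinomial describes the observed counts given those probabilities.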

What are the advantages/disadvantages of using Dirichlet over multinomial distributions?

They are different things, and as you can learn from the thread "Can a Multinomial(1/n, ..., 1/n) be characterized as a discretized Dirichlet(1, .., 1)?", they behave differently in higher dimensions. You would almost never use them interchangeably.

The exception is that in some cases you might want to use a continuous distribution to approximate a discrete one, just as you can approximate the binomial distribution (for large $n$) or the Poisson distribution (for large $\lambda$) with a Gaussian.
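The binomial case of this kind of approximation can be checked numerically. The following sketch compares the binomial probability mass function with a Gaussian density that has the same mean and variance; the values `n = 100` and `p = 0.3` and the helper names `binom_pmf` and `normal_pdf` are chosen here for illustration only.

```python
import math

def binom_pmf(x, n, p):
    """Exact binomial probability of x successes in n trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

def normal_pdf(x, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

n, p = 100, 0.3                      # illustrative values
mu = n * p                           # binomial mean
sigma = math.sqrt(n * p * (1 - p))   # binomial standard deviation

# Largest pointwise gap between the discrete pmf and the continuous density.
max_abs_diff = max(
    abs(binom_pmf(x, n, p) - normal_pdf(x, mu, sigma)) for x in range(n + 1)
)
print(max_abs_diff)
```

Rerunning with a larger `n` should shrink the gap, which is the sense in which the continuous distribution approximates the discrete one.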

What makes the Dirichlet distribution different from a multinomial distribution?

The Dirichlet is a continuous distribution, whereas the multinomial is a discrete one.