Solved – Bayesian estimation of Dirichlet distribution parameters

Tags: bayesian, dirichlet distribution, estimation, gibbs

I want to estimate parameters of Dirichlet mixture models using Gibbs sampling and I have some questions about that:

  1. Is a mixture of Dirichlet distributions equivalent to a Dirichlet process? If not, what are the main differences between them?

  2. Also, if I want to estimate the parameters of a single Dirichlet distribution, which distribution should be chosen as the prior on those parameters in a Bayesian framework?

All of the papers I found estimate a multinomial distribution using Dirichlet priors. I need the reverse: estimation of a Dirichlet distribution, perhaps using multinomial priors.

Is the posterior also of the form Dirichlet(α+N), as in the case of estimating a multinomial distribution using Dirichlet priors? I ask because the product of the probability density functions over the i.i.d. samples does not seem to appear in the definition of the likelihood function, and I cannot understand why.

For example, as expressed in these papers:
http://www.stat.ufl.edu/~aa/cda/bayes.pdf
or
http://research.microsoft.com/en-us/um/people/minka/papers/minka-multinomial.pdf


Thanks for your attention.

My data is Hyperion (a kind of hyperspectral remote sensing imagery), and I want to perform hyperspectral unmixing using a mixture of Dirichlet sources, with Gibbs sampling for parameter estimation. The data has dimensions 614 × 512 × 224; it is the commonly available AVIRIS sensor data for the Cuprite, Nevada district and is almost 200 MB in size. It is also available at http://aviris.jpl.nasa.gov/data/free_data.html. Unfortunately, I don't know how I can send my data.

I am just asking for help with the statistical modelling tasks for my PhD thesis.
I will be very grateful if you can help me resolve my confusion about the modelling.

All the best,
Solmaz

Best Answer

To calculate the density of any conjugate prior see here.

However, you don't need to evaluate the conjugate prior of the Dirichlet in order to perform Bayesian estimation of its parameters. Just average the sufficient statistics over all the samples. For a Dirichlet, the sufficient statistic of one sample is the vector of log-probabilities of the components of an observed categorical distribution parameter vector. The averaged sufficient statistics $(\chi_i)_{i=1}^n$, one entry per component, are the expectation parameters of the maximum likelihood Dirichlet fitting the data. To go from expectation parameters to source parameters, say $(\alpha_i)_{i=1}^n$, you need to solve the following system using numerical methods:
\begin{align} \chi_i = \psi(\alpha_i) - \psi\left(\sum_j\alpha_j\right) \qquad \forall i \end{align}
where $\psi$ is the digamma function.
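The numerical solve described above can be sketched in a few lines of Python. This is only one possible approach, not necessarily what the answer had in mind: it assumes a fixed-point iteration in which each $\alpha_i$ is updated to $\psi^{-1}(\psi(\sum_j \alpha_j) + \chi_i)$, with $\psi^{-1}$ computed by Newton's method. All function names (`digamma`, `trigamma`, `inverse_digamma`, `mean_log_statistics`, `fit_dirichlet`) are hypothetical helpers written for this illustration, and the digamma and trigamma routines are simple series approximations that use only the standard library.

```python
import math


def digamma(x):
    """Approximate psi(x) for x > 0: shift x up to >= 6, then use an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def trigamma(x):
    """Approximate psi'(x) for x > 0 with the same shift-then-series idea."""
    result = 0.0
    while x < 6.0:
        result += 1.0 / (x * x)
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = (1.0 / x) * inv2 * (1.0 / 6 - inv2 * (1.0 / 30 - inv2 / 42))
    return result + 1.0 / x + 0.5 * inv2 + series


def inverse_digamma(y, max_iter=30):
    """Solve psi(x) = y for x > 0 by Newton's method from a rough starting guess."""
    x = math.exp(y) + 0.5 if y >= -2.22 else -1.0 / (y - digamma(1.0))
    for _ in range(max_iter):
        step = (digamma(x) - y) / trigamma(x)
        x -= step
        if abs(step) < 1e-12 * x:
            break
    return x


def mean_log_statistics(samples):
    """Average the log-probability vectors of the samples (the chi_i of the text)."""
    n_components = len(samples[0])
    return [
        sum(math.log(p[i]) for p in samples) / len(samples)
        for i in range(n_components)
    ]


def fit_dirichlet(chi, tol=1e-10, max_iter=10000):
    """Solve chi_i = psi(alpha_i) - psi(sum_j alpha_j) by fixed-point iteration."""
    alpha = [1.0] * len(chi)  # arbitrary starting point (an assumption)
    for _ in range(max_iter):
        psi_total = digamma(sum(alpha))
        new_alpha = [inverse_digamma(psi_total + c) for c in chi]
        change = max(abs(a - b) for a, b in zip(alpha, new_alpha))
        alpha = new_alpha
        if change < tol:
            break
    return alpha


if __name__ == "__main__":
    # Build chi from a known alpha, then check that the solver recovers it.
    true_alpha = [2.0, 3.0, 5.0]
    total = digamma(sum(true_alpha))
    chi = [digamma(a) - total for a in true_alpha]
    print(fit_dirichlet(chi))
```

In practice `chi` would come from `mean_log_statistics` applied to the observed probability vectors, which must be strictly positive so that the logarithms are finite. The fixed-point iteration can converge slowly when the $\alpha_i$ are large, so a Newton step on the full system may be preferable there.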

To answer your first question, a mixture of Dirichlets is not Dirichlet because, for one thing, it can be multimodal.
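The multimodality point can be checked numerically. The sketch below is an illustration only: it uses the two-component case (where a Dirichlet reduces to a Beta distribution), and the weights and parameters are made-up values chosen so that the two mixture components peak in different places. The helper names `dirichlet_pdf` and `mixture_pdf` are hypothetical.

```python
import math


def dirichlet_pdf(x, alpha):
    """Density of Dirichlet(alpha) at a point x on the simplex (all x_i > 0)."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    log_kernel = sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))
    return math.exp(log_norm + log_kernel)


def mixture_pdf(x, weights, alphas):
    """Density of a finite mixture of Dirichlet distributions at x."""
    return sum(w * dirichlet_pdf(x, a) for w, a in zip(weights, alphas))


# Hypothetical mixture: equal weights, components peaked near opposite corners.
weights = [0.5, 0.5]
alphas = [[2.0, 10.0], [10.0, 2.0]]

near_first_mode = mixture_pdf([0.1, 0.9], weights, alphas)
between_modes = mixture_pdf([0.5, 0.5], weights, alphas)
near_second_mode = mixture_pdf([0.9, 0.1], weights, alphas)

# The density dips between the two peaks, so the mixture has two modes.
print(near_first_mode, between_modes, near_second_mode)
```

A single Dirichlet whose parameters are all greater than 1 has exactly one interior mode, so a density with a dip between two peaks like this cannot be a Dirichlet.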