Solved – Bayes Estimation on Dirichlet distribution

dirichlet-distribution, multinomial-distribution, probability

I'm trying to get my head around the "hidden species" problem. It goes something like this. You visit a park and run into three species: 3 lions, 2 tigers and 1 bear. You are to give your best guess of the distribution of these species. You may be tempted to compute $\frac{3}{3+2+1} = 50\%$ for the lions, 33% for the tigers and 17% for the bears, but this is wrong. Instead, the Dirichlet distribution is used to solve this. Described here. In short, what the author says is to use $\alpha = (1,1,1)$, i.e. $\alpha_i = 1$ for each species, and arrive at
$P(\text{Lion}) = \frac{1 + 3}{1+1+1 + 3 + 2 + 1} = \frac{4}{9}$. The $1+1+1$ in the denominator is the sum of the $\alpha_i$.
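
Spelled out for all three species, and assuming the estimate is the posterior mean of the Dirichlet with prior parameters $\alpha_i$ and observed counts $n_i$ (the symbol $n_i$ is my own notation, not the author's), I believe this works out to:

$$P(\text{species } i) = \frac{\alpha_i + n_i}{\sum_j (\alpha_j + n_j)}, \qquad P(\text{Lion}) = \frac{1+3}{9} = \frac{4}{9}, \quad P(\text{Tiger}) = \frac{1+2}{9} = \frac{3}{9}, \quad P(\text{Bear}) = \frac{1+1}{9} = \frac{2}{9}.$$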

Specifically my questions are the following.

1) While I get the fact that the Dirichlet distribution is the conjugate prior of the multinomial distribution, how did the author choose the alphas in the Dirichlet distribution to be (1,1,1)?

2) Since the alphas chosen here act like "imaginary" counts, how can I accommodate the information (if it exists) that I had seen, say, 4 lions, 1 tiger and 2 bears in the past?

Could someone shed some more explanatory light on this?
Thanks much

Best Answer

In answer to your specific questions:

  1. The initial choice of prior was arbitrary. It was probably an attempt to use an uninformative prior while avoiding an improper one: with parameters $(1,1,1)$ the Dirichlet density is uniform over all possible species proportions.

  2. Because a Dirichlet distribution is a conjugate prior with the property that to "update the Dirichlet distribution, we add the number of observations to each parameter", you just add the observations. So an initial prior with parameters $(1,1,1)$ and earlier observations $(4,1,2)$ gives a revised prior with parameters $(5,2,3)$. Combining this with the new observations $(3,2,1)$ gives a posterior distribution with parameters $(8,4,4)$. You might use this as a new prior for subsequent visits to the park.
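
The arithmetic in both points can be checked with a few lines of Python. This is only a minimal sketch: the helper names `update_dirichlet` and `posterior_mean` are made up for illustration, and it assumes the point estimate you want is the posterior mean, which is what the $\frac{4}{9}$ in the question corresponds to.

```python
from fractions import Fraction


def update_dirichlet(alpha, counts):
    """Conjugate update: add the observed counts to the Dirichlet parameters."""
    return tuple(a + n for a, n in zip(alpha, counts))


def posterior_mean(alpha):
    """Mean of a Dirichlet distribution: each parameter divided by their sum."""
    total = sum(alpha)
    return tuple(Fraction(a, total) for a in alpha)


# Order of species: (lion, tiger, bear)
prior = (1, 1, 1)   # the author's uniform prior
past = (4, 1, 2)    # earlier observations from the question
new = (3, 2, 1)     # observations from the current visit

revised_prior = update_dirichlet(prior, past)    # (5, 2, 3)
posterior = update_dirichlet(revised_prior, new)  # (8, 4, 4)
estimate = posterior_mean(posterior)              # 1/2, 1/4, 1/4

print("revised prior:", revised_prior)
print("posterior:", posterior)
print("posterior mean:", [str(p) for p in estimate])
```

With the past observations included, the estimate for lions becomes $\frac{8}{16} = \frac{1}{2}$ rather than the $\frac{4}{9}$ obtained from the $(1,1,1)$ prior and the current visit alone.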