I know that the beta distribution is conjugate to the binomial. But what is the conjugate prior of the beta? Thank you.
Beta Distribution – Does It Have a Conjugate Prior?
Related Solutions
To go from the third to the fourth row, just ignore factors that are constant with respect to $\pi$. That is, let
$$\binom{n}{y}\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}=k$$
so
$$p(\pi|y)=k \cdot \pi^y (1-\pi)^{(n-y)}\pi^{(\alpha-1)}(1-\pi)^{(\beta-1)}$$
or
$$p(\pi|y) \propto \pi^y(1-\pi)^{(n-y)}\pi^{(\alpha-1)}(1-\pi)^{(\beta-1)}$$
The idea is to work only with the kernel of the probability distribution, knowing you can always put the normalizing constant back in later.
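Collecting the exponents in the kernel makes the conjugacy explicit:
$$p(\pi|y) \propto \pi^{(\alpha+y)-1}(1-\pi)^{(\beta+n-y)-1}$$
which is the kernel of a Beta distribution, so $\pi\,|\,y \sim \text{Beta}(\alpha+y,\ \beta+n-y)$.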
The problem with the Wikipedia article and the reference therein (Fink, D., 1997) is that some key information is missing.
Specifically, the given posterior is for $M-x$ (i.e., the number of target individuals in the population shifted by the number observed in the sample), not for $M$. Furthermore, the posterior parameter corresponding to the number of observations is missing and should be $N-n$ (i.e., the population size minus the sample size). These two corrections fix the support problem that you correctly noticed, as shown below.
Suppose that $0 \leq X \leq n$ is the number of target individuals in a sample of size $n$ from a population of size $N$ with $0 \leq M \leq N$ total target individuals.
Then, $X \sim \text{HG}(n, M, N)$ with support in $[\max(0, n-N+M), \min(n, M)]$.
If $M \sim \text{BB}(N, \alpha, \beta)$ is the prior distribution of $M$, the posterior distribution for $M - x$ is also Beta-Binomial-distributed: $$M - x\,|\,x,\alpha,\beta \sim \text{BB}(N-n, \alpha + x, \beta + n - x)$$
If you write the probability mass function for $M$ you will find @Tim's answer above.
As an illustration, for $N = 20$ and $n = 10$, let's assume a non-informative prior distribution for $M$ with $M \sim \text{BB}(N, .5, .5)$. Suppose that we observe $x = 9$.
library(extraDistr)  # dbbinom: beta-binomial density
library(tidyverse)

N <- 20          # population size
n <- 10          # sample size
a0 <- b0 <- .5   # prior parameters
x <- 9           # observed number of target individuals

data.frame(m = 0:N) %>%
  mutate(
    prior = dbbinom(m, size = N, alpha = a0, beta = b0),
    # posterior: M - x ~ BB(N - n, a0 + x, b0 + n - x), so evaluate at m - x
    post = dbbinom(m - x, size = N - n, alpha = a0 + x, beta = b0 + n - x)
  ) %>%
  gather(key, dens, -m) %>%
  ggplot(aes(m, dens, col = key)) +
  geom_line() +
  geom_point()
Created on 2018-10-10 by the reprex package (v0.2.1)
Note that the posterior support is correctly [x, N − n + x].
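As a quick independent check of that support claim, here is a Python sketch mirroring the R code above (this uses `scipy.stats.betabinom`, which parameterizes the beta-binomial the same way as `dbbinom`):

```python
import numpy as np
from scipy.stats import betabinom

N, n = 20, 10    # population and sample size
a0 = b0 = 0.5    # prior parameters
x = 9            # observed count

# Posterior pmf of M: M - x ~ BB(N - n, a0 + x, b0 + n - x),
# so evaluate the shifted pmf at m - x for every m in 0..N.
m = np.arange(N + 1)
post = betabinom.pmf(m - x, N - n, a0 + x, b0 + n - x)

print(post.sum())                # ~ 1: a valid pmf over m = 0..N
print(np.flatnonzero(post > 0))  # indices with positive mass: x .. N - n + x
```

The pmf is zero for $m < x$ and $m > N - n + x$, confirming the support $[x, N - n + x]$.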
Dyer, D. and Pierce, R.L. (1993). On the Choice of the Prior Distribution in Hypergeometric Sampling. Communications in Statistics - Theory and Methods, 22(8), 2125-2146.
Best Answer
It seems that you already gave up on conjugacy. Just for the record, one thing that I've seen people do (but I don't remember exactly where, sorry) is a reparameterization like this.

If $X_1,\dots,X_n$ are conditionally iid, given $\alpha,\beta$, such that $X_i\mid\alpha,\beta\sim\mathrm{Beta}(\alpha,\beta)$, remember that $$ \mathbb{E}[X_i\mid\alpha,\beta]=\frac{\alpha}{\alpha+\beta} =: \mu $$ and $$ \mathrm{Var}[X_i\mid\alpha,\beta] = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)} =: \sigma^2 \, . $$

Hence, you may reparameterize the likelihood in terms of $\mu$ and $\sigma^2$ and use as a prior $$ \sigma^2\mid\mu \sim \mathrm{U}[0,\mu(1-\mu)] \qquad \qquad \mu\sim\mathrm{U}[0,1] \, . $$ Now you're ready to compute the posterior and explore it with your favorite computational method.
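For completeness, here is a minimal grid sketch of that recipe in Python (the simulated Beta(2, 5) data, the grid sizes, and the variable names are my own choices for illustration, not part of the original answer):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.beta(2.0, 5.0, size=50)   # simulated data; true mean is 2/7

n_mu, n_s2 = 99, 60
mus = np.linspace(0.01, 0.99, n_mu)
logpost = np.full((n_mu, n_s2), -np.inf)
for i, mu in enumerate(mus):
    # sigma^2 | mu ~ U(0, mu(1 - mu)): grid strictly inside that support
    s2 = np.linspace(1e-4, 0.999 * mu * (1 - mu), n_s2)
    # invert the moment equations: alpha + beta = mu(1 - mu)/sigma^2 - 1
    nu = mu * (1 - mu) / s2 - 1.0
    a, b = mu * nu, (1 - mu) * nu
    # log-likelihood summed over the data, for every sigma^2 in this row
    ll = stats.beta.logpdf(x[:, None], a, b).sum(axis=0)
    # uniform priors contribute only a constant; weight each row by its
    # own grid spacing since the sigma^2 grid depends on mu
    logpost[i] = ll + np.log(s2[1] - s2[0])

post = np.exp(logpost - logpost.max())
post /= post.sum()

mu_mean = (post.sum(axis=1) * mus).sum()   # posterior mean of mu
print(mu_mean)
```

A grid is only practical here because the parameter space is two-dimensional; MCMC on the same reparameterized posterior works just as well.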