[Math] Mathematically prove that a Beta prior distribution is conjugate to a Geometric likelihood function

bayesian, beta function, maximum likelihood

I have to prove, with a simple example and a plot, that the Beta prior distribution is conjugate to the geometric likelihood function. I know the basic definition:

'In Bayesian probability theory, a class of prior distributions $f(\theta)$ is said to be conjugate to a class of likelihood functions $f(x|\theta)$ if the resulting posterior distribution is of the same class as $f(\theta)$.'

But I don't know how to prove it mathematically.

P.S. – It would be really nice of you guys to suggest some good material on Bayesian statistics and probability theory.

Best Answer

Find the posterior $f(\theta|x)$. By Bayes' theorem we know that $f(\theta|x) = C\, f(x|\theta)f(\theta)$, where $C$ is just a normalisation constant that makes the posterior integrate to $1$.

$f(\theta)$ is the PDF of the prior distribution, i.e. a Beta distribution with some parameters $(\alpha, \beta)$. Here $f(\theta) = C' \theta^{\alpha-1}(1-\theta)^{\beta-1}$, where $C' = 1/B(\alpha, \beta)$ is another normalisation constant.
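As a quick numerical sanity check (a minimal sketch; the parameters $\alpha = 2$, $\beta = 5$ and the point $\theta = 0.3$ are arbitrary illustrative values, not from the question), the kernel divided by the Beta function matches SciPy's built-in density:

```python
from scipy import stats
from scipy.special import beta as beta_fn  # B(alpha, beta), the reciprocal of C'

# Arbitrary illustrative prior parameters and evaluation point
a, b = 2.0, 5.0
theta = 0.3

# f(theta) = C' * theta^(a-1) * (1-theta)^(b-1) with C' = 1 / B(a, b)
manual = theta**(a - 1) * (1 - theta)**(b - 1) / beta_fn(a, b)
print(manual, stats.beta.pdf(theta, a, b))  # the two values agree
```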

$f(x|\theta)$ is the likelihood of the data $x = (x_1, \dots, x_n)$, where each $x_i$ counts the failures before the first success and is geometrically distributed with parameter $\theta$. Our geometric likelihood function is $f(x|\theta) = \prod_{i=1}^n (1-\theta)^{x_i}\theta = (1-\theta)^{\sum_{i=1}^n x_i}\theta^n$.
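A small check that the product collapses as claimed (a sketch with made-up data; the counts in `x` are hypothetical failures-before-first-success observations):

```python
import numpy as np

# Hypothetical observations: x_i = number of failures before the first success
x = np.array([2, 0, 5, 1, 3])
n = len(x)
theta = 0.3

# Product form of the geometric likelihood: prod_i (1-theta)^{x_i} * theta
prod_form = np.prod((1 - theta)**x * theta)
# Collapsed form from above: (1-theta)^{sum x_i} * theta^n
closed_form = (1 - theta)**x.sum() * theta**n
print(prod_form, closed_form)  # identical up to floating-point rounding
```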

Now we take the product of these two. We expect it to have the same form as the Beta prior but with new parameters $\alpha', \beta'$, which we will now identify.

So $f(x|\theta)f(\theta) = C' \theta^{\alpha+n-1}(1-\theta)^{\beta+\sum_{i=1}^n x_i-1}$, which is proportional to a Beta density. Reading off the exponents, the new parameters are $\alpha' = \alpha+n$ and $\beta' = \beta + \sum_{i=1}^n x_i$, so the posterior is $\mathrm{Beta}\big(\alpha+n,\ \beta+\sum_{i=1}^n x_i\big)$. Mission accomplished.
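For the simple example and plot the question asks for, here is a minimal end-to-end sketch, assuming a Beta(2, 2) prior, a true success probability of 0.3, and $n = 20$ simulated observations (all hypothetical choices). It applies the conjugate update derived above and overlays a numerically normalised likelihood × prior curve to confirm the posterior really is $\mathrm{Beta}(\alpha+n,\ \beta+\sum_i x_i)$:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical setup: Beta(2, 2) prior, true theta = 0.3, n = 20 observations
alpha, beta_ = 2.0, 2.0
theta_true, n = 0.3, 20

# rng.geometric counts trials up to and including the first success (support >= 1);
# subtracting 1 gives failures before the first success, matching (1-theta)^x * theta
x = rng.geometric(theta_true, size=n) - 1

# Conjugate update derived above: alpha' = alpha + n, beta' = beta + sum(x)
alpha_post = alpha + n
beta_post = beta_ + x.sum()

grid = np.linspace(0.001, 0.999, 500)
prior = stats.beta.pdf(grid, alpha, beta_)
posterior = stats.beta.pdf(grid, alpha_post, beta_post)

# Cross-check: normalise likelihood * prior numerically on the grid;
# it should coincide with the closed-form Beta posterior
unnorm = grid**n * (1 - grid)**x.sum() * prior
numeric = unnorm / (unnorm.sum() * (grid[1] - grid[0]))

plt.plot(grid, prior, label=f"prior Beta({alpha:g}, {beta_:g})")
plt.plot(grid, posterior, label=f"posterior Beta({alpha_post:g}, {beta_post:g})")
plt.plot(grid, numeric, "k--", label="normalised likelihood × prior")
plt.axvline(theta_true, color="grey", linestyle=":", label="true θ")
plt.xlabel("θ")
plt.ylabel("density")
plt.legend()
plt.show()
```

The dashed curve landing exactly on the closed-form posterior is the conjugacy statement in picture form: multiplying the geometric likelihood into the Beta prior only shifts the Beta parameters, it never leaves the Beta family.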