Solved – Different notation for Bayes’ prior and posterior distributions

Tags: bayesian, notation, posterior, prior

Bayes' rule is given by:

$$P(\theta|X) = \frac{P(X|\theta)P(\theta)}{P(X)}$$

where $X$ denotes the observations and $\theta$ is some model parameter. I would like to use an alternative notation that more strongly differentiates between the prior $P(\theta)$ and the posterior $P(\theta|X)$ distributions. Is it appropriate to write:

$$P(\theta_\text{post}|X) = \frac{P(X|\theta_\text{prior})P(\theta_\text{prior})}{P(X)}$$

Can it be said that the posterior and prior describe the distributions of two different random variables, namely $\theta_\text{post}$ and $\theta_\text{prior}$ respectively? Or are the prior and posterior different distributions of the same random variable $\theta$? In that case, perhaps we should write:

$$P_\text{post}(\theta|X) = \frac{P(X|\theta)P_\text{prior}(\theta)}{P(X)}$$

Equally, how should one denote the prior distribution?

$$\theta_\text{prior} \sim N(0,1)$$

Or:

$$P_\text{prior}(\theta) \sim N(0,1)$$

Best Answer

If you want to distinguish them, you can put subscripts directly on the probability mass (or density) functions, as you have done in your second example. The usual convention is to subscript each function with the random variable it describes, using conditional notation where needed:

$$P_{\theta|X}(\theta|X) = \frac{P_{X|\theta}(X|\theta)P_\theta(\theta)}{P_X(X)}$$

This has the advantage that you can distinguish $P_X(1)$ from $P_\theta(2)$, and $P_{X|\theta}(1|2)$ from $P_{\theta|X}(2|1)$. When that distinction is unnecessary, the notation just looks repetitive, because each subscript is identical to the content between the parentheses. For that reason the subscripts are typically dropped.
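To make the role of the subscripts concrete, here is a minimal R sketch with a toy discrete model in which $\theta$ and $X$ each take the values 1 or 2. The probabilities and the object names (`P_theta`, `P_X_given_theta`, and so on) are made up for illustration and do not come from the question.

```r
# Hypothetical toy model: theta and X each take the values 1 or 2.
# All probabilities are invented for illustration.
P_theta <- c("1" = 0.3, "2" = 0.7)   # prior pmf P_theta(.)

# Likelihood P_{X|theta}(x | theta): rows index x, columns index theta.
P_X_given_theta <- matrix(
  c(0.9, 0.1,    # column theta = 1
    0.2, 0.8),   # column theta = 2
  nrow = 2,
  dimnames = list(x = c("1", "2"), theta = c("1", "2"))
)

# Marginal pmf of the data:
# P_X(x) = sum over theta of P_{X|theta}(x | theta) * P_theta(theta)
P_X <- drop(P_X_given_theta %*% P_theta)

# Joint pmf, then Bayes' rule. After the transpose, rows index theta
# and columns index x, matching the order in P_{theta|X}(theta | x).
joint <- sweep(P_X_given_theta, 2, P_theta, "*")
P_theta_given_X <- t(sweep(joint, 1, P_X, "/"))

P_X[["1"]]                  # P_X(1)           = 0.41
P_theta[["2"]]              # P_theta(2)       = 0.7
P_X_given_theta["1", "2"]   # P_{X|theta}(1|2) = 0.2
P_theta_given_X["2", "1"]   # P_{theta|X}(2|1) = 0.14 / 0.41, about 0.341
```

Without the subscripts, all four quantities would be written as $P(1)$, $P(2)$, $P(1|2)$ and $P(2|1)$, and it would be impossible to tell which function each one refers to.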

As for the second question, $$\theta \sim N(0,1)$$ is appropriate, because it is the random variable $\theta$ that has a standard normal distribution. Writing $P_\text{prior}(\theta) \sim N(0,1)$ would instead say that the probability density function itself is a random quantity with a standard normal distribution, which is not what is meant.
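The point that the prior and posterior are two distributions of the same random variable $\theta$ can be illustrated with a short R sketch. It assumes, purely for illustration, a single observation `x_obs = 1.2` from a normal sampling model with known standard deviation `sigma = 0.5`; neither value comes from the question. Under these assumptions the posterior is again normal, and the closed form can be checked against a direct numerical application of Bayes' rule.

```r
# Prior from the question: theta ~ N(0, 1).
prior_mean <- 0
prior_sd   <- 1

# Hypothetical sampling model and data (assumptions, not from the question):
# X | theta ~ N(theta, sigma^2), with one observation x_obs.
sigma <- 0.5
x_obs <- 1.2

# Closed-form normal-normal update for the same random variable theta.
post_var  <- 1 / (1 / prior_sd^2 + 1 / sigma^2)
post_mean <- post_var * (prior_mean / prior_sd^2 + x_obs / sigma^2)
post_sd   <- sqrt(post_var)

# Numerical check: likelihood times prior, normalised by P_X(x_obs).
unnorm <- function(theta) {
  dnorm(x_obs, mean = theta, sd = sigma) * dnorm(theta, prior_mean, prior_sd)
}
evidence     <- integrate(unnorm, -Inf, Inf)$value
post_density <- function(theta) unnorm(theta) / evidence

# The two posterior densities should agree up to integration error.
theta_grid <- c(0, 0.5, 1, 1.5)
max(abs(post_density(theta_grid) - dnorm(theta_grid, post_mean, post_sd)))
```

In the notation of the answer, one would write $\theta \sim N(0, 1)$ for the prior and $\theta \mid X = 1.2 \sim N(0.96, 0.2)$ for the posterior, where the second parameter is the variance: the same $\theta$, with a different distribution once the data are conditioned on.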

Related Solutions

Solved – Posterior very different to prior and likelihood

Yes, this situation can arise, and it is a feature of your modeling assumptions, specifically normality in both the prior and the sampling model (likelihood). If you had instead chosen a Cauchy distribution for your prior, the posterior would look much different.

# Cauchy prior (location 1.5, scale 0.4) and normal likelihood (mean 6.1, sd 0.4)
prior <- function(x) dcauchy(x, 1.5, 0.4)
like  <- function(x) dnorm(x, 6.1, 0.4)

# Posterior: prior times likelihood, normalised to integrate to 1
propto <- function(x) prior(x) * like(x)
d      <- integrate(propto, -Inf, Inf)
post   <- function(x) propto(x) / d$value

# Plot all three curves on the same axes
par(mar = c(0, 0, 0, 0) + 0.1, lwd = 2)
curve(like, 0, 8, col = "red", axes = FALSE, frame.plot = TRUE)
curve(prior, add = TRUE, col = "blue")
curve(post, add = TRUE, col = "seagreen")
legend("bottomleft", c("Prior", "Likelihood", "Posterior"),
       col = c("blue", "red", "seagreen"), lty = 1, bg = "white")

Cauchy prior, normal sampling model
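For comparison, the following sketch contrasts the two priors numerically. It assumes the normal prior shares the centre and scale used for the Cauchy prior above (1.5 and 0.4); that choice is an assumption made for illustration, since the original normal prior is not shown here. The normal case uses the closed-form conjugate update rather than `integrate`, because the product of two well-separated normal densities is small enough that the default absolute tolerance of the numerical integration is hard to trust.

```r
# Values taken from the example above; using the same centre and scale
# for a normal prior is an assumption made for this comparison.
prior_loc <- 1.5; prior_scale <- 0.4
like_mean <- 6.1; like_sd     <- 0.4

# Normal prior + normal likelihood: closed-form posterior.
post_var  <- 1 / (1 / prior_scale^2 + 1 / like_sd^2)
post_mean <- post_var * (prior_loc / prior_scale^2 + like_mean / like_sd^2)
post_sd   <- sqrt(post_var)

# The posterior piles up where both the prior and the likelihood are tiny.
dnorm(post_mean, prior_loc, prior_scale)   # prior density at posterior mean
dnorm(post_mean, like_mean, like_sd)       # likelihood at posterior mean
dnorm(post_mean, post_mean, post_sd)       # posterior density at its mean

# Cauchy prior + normal likelihood: locate the posterior mode numerically.
propto_cauchy <- function(x) {
  dcauchy(x, prior_loc, prior_scale) * dnorm(x, like_mean, like_sd)
}
cauchy_mode <- optimize(propto_cauchy, interval = c(0, 8), maximum = TRUE)$maximum

c(normal_prior_post_mean = post_mean, cauchy_prior_post_mode = cauchy_mode)
```

With the normal prior, the posterior is centred at 3.8, midway between the prior and the likelihood, in a region that both consider very unlikely. With the heavy-tailed Cauchy prior, the posterior mode stays close to the likelihood's centre at 6.1, which matches the plot above.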
