Different notation for Bayes’ prior and posterior distributions

Bayes' rule is given by:

$$P(\theta|X) = \frac{P(X|\theta)P(\theta)}{P(X)}$$

where $X$ are the observations and $\theta$ is some model parameter. I would like to use an alternative notation that differentiates more strongly between the prior distribution $P(\theta)$ and the posterior distribution $P(\theta|X)$. Is it appropriate to write:

$$P(\theta_\text{post}|X) = \frac{P(X|\theta_\text{prior})P(\theta_\text{prior})}{P(X)}$$

Can it be said that the posterior and prior describe the distributions of two different random variables, namely $\theta_\text{post}$ and $\theta_\text{prior}$ respectively? Or are the prior and posterior different distributions of the same random variable $\theta$? In that case, perhaps we should write:

$$P_\text{post}(\theta|X) = \frac{P(X|\theta)P_\text{prior}(\theta)}{P(X)}$$

Equally, how should one denote the prior distribution?

$$\theta_\text{prior} \sim N(0,1)$$

Or:

$$P_\text{prior}(\theta) \sim N(0,1)$$

Best Answer

If you want to distinguish them, you can use subscripts on the probability mass (or density) functions directly (as you have done in your second example). This is usually written with the subscript in conditional notation, i.e.

$$P_{\theta|X}(\theta|X) = \frac{P_{X|\theta}(X|\theta)P_\theta(\theta)}{P_X(X)}$$

This has the advantage that you can distinguish $P_X(1)$ from $P_\theta(2)$, and $P_{X|\theta}(1|2)$ from $P_{\theta|X}(2|1)$, when the arguments are plain numbers. But if this distinction is unnecessary, the notation just appears repetitive, because the subscript is identical to the content between the parentheses. Thus, the subscripts are typically dropped.
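To make the point about numeric arguments concrete, here is a minimal sketch in Python. The joint table and the function names (`p_theta`, `p_x`, `p_x_given_theta`, `p_theta_given_x`) are hypothetical and chosen only for illustration. Each subscripted function becomes its own named function, so the same numbers passed as arguments give different results depending on which function is called.

```python
from fractions import Fraction as F

# Hypothetical joint pmf over (theta, x); the values are illustrative only.
joint = {
    (1, 1): F(1, 10),
    (1, 2): F(3, 10),
    (2, 1): F(2, 10),
    (2, 2): F(4, 10),
}

def p_theta(t):
    """P_theta: marginal (prior) pmf of theta."""
    return sum(p for (ti, _), p in joint.items() if ti == t)

def p_x(x):
    """P_X: marginal pmf of the observation X."""
    return sum(p for (_, xi), p in joint.items() if xi == x)

def p_x_given_theta(x, t):
    """P_{X|theta}: likelihood of x given theta."""
    return joint[(t, x)] / p_theta(t)

def p_theta_given_x(t, x):
    """P_{theta|X}: posterior pmf of theta, computed with Bayes' rule."""
    return p_x_given_theta(x, t) * p_theta(t) / p_x(x)

print(p_x(1), p_theta(2))                            # 3/10 3/5
print(p_x_given_theta(1, 2), p_theta_given_x(2, 1))  # 1/3 2/3
```

Note that `p_theta` and `p_theta_given_x` are two different functions of the same variable `theta`, which matches the reading that prior and posterior are different distributions of one random variable.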

As for the second question,

$$\theta \sim N(0,1)$$

is appropriate, because it is the random variable $\theta$ that has a standard normal distribution, not the probability density function of $\theta$.
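The difference between the random variable and its density function can also be sketched in code. This is only an illustration using Python's standard-library `statistics.NormalDist`; the names `prior`, `prior_density` and `draws` are my own. The density is an ordinary deterministic function, while $\theta$ is the quantity that gets drawn from the distribution.

```python
from statistics import NormalDist

# Assumed prior for illustration: theta ~ N(0, 1).
prior = NormalDist(mu=0.0, sigma=1.0)

def prior_density(theta: float) -> float:
    """Density of theta under the prior: a fixed function, not a random quantity."""
    return prior.pdf(theta)

# Realisations of the random variable theta (seed fixed for reproducibility).
draws = prior.samples(5, seed=0)

print(prior_density(0.0))  # about 0.3989, i.e. 1/sqrt(2*pi)
print(draws)               # five draws of theta
```

Calling `prior_density(0.0)` always returns the same number, so it makes no sense to say that the function itself is "distributed as" $N(0,1)$; it is the values in `draws` that follow that distribution.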
