[Math] Confusion about Notation for Bayesian Statistics

bayesian, notation, statistics

I'm currently trying to learn Bayesian statistics, but I keep losing time trying to figure out exactly what the notation means. Could someone answer the following for me?

Let's say $X \sim N(\mu,\sigma^2)$

(1) I'm trying to calculate a posterior distribution $p(\mu\mid X) \propto p(X\mid\mu)p(\mu)$. So my understanding is that $p(\mu\mid X)$ is the probability distribution of the parameter $\mu$ given the data $X$. What then does $p(\mu\mid X,\sigma^2)$ mean exactly? My guess is that it is the probability distribution of the parameter $\mu$ given the data $X$ and assuming that $\sigma^2$ is fixed. Is that correct?

(2) Following up from (1), for $p(\mu\mid X,\sigma^2)$, what does the posterior function transform into? Is it $p(\mu\mid X,\sigma^2) \propto p(X\mid\mu,\sigma^2)p(\mu,\sigma^2)$? If it is different, how does the likelihood function really change? My understanding is that the likelihood function is based on the way the data are distributed and not on the parameters conditioned on. Is there a difference?

(3) Similar question regarding the prior. If we are told that $p(\mu) \propto 1$, is there a difference between $p(\mu)$ and $p(\mu,\sigma^2)$?

Any clarification would be greatly appreciated!

Best Answer

Firstly, one should distinguish between the capital $X$ used to denote the random variable and the lower-case $x$ used as the argument to the density function or the cumulative distribution function, etc. This makes it possible to understand an expression like $\Pr(X\le x)=F(x)$ (with lower-case $x$ in two places and capital $X$ in one). Then $x\mapsto f_X(x)$ is the density function, and we can understand the difference between $f_X(3)$ and $f_Y(3)$ because they are densities of two different random variables.
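As a small illustration of the capital/lower-case distinction, here is a sketch using Python's standard-library `statistics.NormalDist`. The two distributions, $X \sim N(0,1)$ and $Y \sim N(2,1)$, are assumptions picked only for the example; they do not come from the question.

```python
from statistics import NormalDist

# Hypothetical random variables, chosen only for illustration:
# X ~ N(0, 1) and Y ~ N(2, 1).
X = NormalDist(mu=0.0, sigma=1.0)
Y = NormalDist(mu=2.0, sigma=1.0)

x = 3.0  # lower-case x: an ordinary number used as the argument

# F_X(x) = Pr(X <= x): capital X names the random variable, x is the argument.
F_X_at_x = X.cdf(x)

# f_X(3) and f_Y(3): the same argument fed to two different density functions.
f_X_at_x = X.pdf(x)
f_Y_at_x = Y.pdf(x)

print(f"Pr(X <= {x}) = {F_X_at_x:.4f}")
print(f"f_X({x}) = {f_X_at_x:.4f}, f_Y({x}) = {f_Y_at_x:.4f}")
```

The subscript plays the role of the object (`X` or `Y`), while the argument is just the number passed to `pdf` or `cdf`.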

If we write $\mu$ as a random variable (or, to be more "Bayesian", an uncertain quantity to which a probability distribution is assigned) and $m$ as the argument to its density function, then one can write, for example, $$ \Pr( 2 < \mu < 5) = \int_2^5 f_\mu(m)\, dm. $$
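The integral above can be checked numerically. The sketch below assumes, purely for illustration, that $\mu$ is assigned a $N(3,1)$ distribution. It compares a midpoint-rule approximation of $\int_2^5 f_\mu(m)\,dm$ with the difference of the distribution function at the endpoints. The helper `integrate` is a hypothetical name written for this example.

```python
from statistics import NormalDist

# Assumption for illustration only: the uncertain quantity mu is assigned
# a N(3, 1) distribution. Nothing in the text fixes this choice.
mu_dist = NormalDist(mu=3.0, sigma=1.0)


def f_mu(m: float) -> float:
    """Density of mu evaluated at the argument m."""
    return mu_dist.pdf(m)


def integrate(f, a: float, b: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


# Pr(2 < mu < 5) computed two ways.
prob_by_integral = integrate(f_mu, 2.0, 5.0)
prob_by_cdf = mu_dist.cdf(5.0) - mu_dist.cdf(2.0)

print(prob_by_integral, prob_by_cdf)
```

The two numbers should agree to many decimal places, which is just the statement that the density integrates to the probability.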

In any case, the notation $f_X( x \mid \mu,\sigma)$ means the conditional probability density function of $X$ given the values of $\mu$ and $\sigma$, regardless of whether those are known values or uncertain values to which probability distributions are assigned.
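To connect this back to the posterior in the question: when everything is conditioned on $\sigma^2$, Bayes' rule reads $p(\mu\mid x,\sigma^2) \propto p(x\mid\mu,\sigma^2)\,p(\mu\mid\sigma^2)$, so the prior term is the conditional $p(\mu\mid\sigma^2)$ rather than the joint $p(\mu,\sigma^2)$, and the likelihood is unchanged. The sketch below evaluates this on a grid under stated assumptions: the data values and $\sigma = 1$ are invented for the example, and the prior is taken to be flat, $p(\mu\mid\sigma^2)\propto 1$. Under those assumptions the posterior should be approximately $N(\bar{x}, \sigma^2/n)$.

```python
from math import sqrt
from statistics import NormalDist, fmean

# Hypothetical data and a known sigma, both invented for illustration.
data = [4.8, 5.1, 5.6, 4.9]
sigma = 1.0


def likelihood(m: float) -> float:
    """p(x_1..x_n | mu = m, sigma): product of normal densities."""
    dist = NormalDist(mu=m, sigma=sigma)
    result = 1.0
    for x in data:
        result *= dist.pdf(x)
    return result


# Grid of candidate values m for mu. With a flat prior (proportional to 1),
# the unnormalized posterior equals the likelihood.
step = 0.001
grid = [i * step for i in range(-5000, 15001)]  # m from -5 to 15
unnorm = [likelihood(m) for m in grid]

# Normalize numerically so the posterior integrates to 1 over the grid.
total = sum(unnorm) * step
posterior = [u / total for u in unnorm]

post_mean = sum(m * p for m, p in zip(grid, posterior)) * step
post_var = sum((m - post_mean) ** 2 * p for m, p in zip(grid, posterior)) * step

# Compare with the closed form: mean = sample mean, sd = sigma / sqrt(n).
print(post_mean, fmean(data))
print(sqrt(post_var), sigma / sqrt(len(data)))
```

Changing `sigma` changes the spread of the posterior but not the form of the likelihood, which is the sense in which $\sigma^2$ is "held fixed" on the right of the conditioning bar.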
