[Math] Confusion between probability distribution function and probability density function

probability

As I understand it, a probability distribution function (probability mass function) is for discrete random variables, while a probability density function is for continuous random variables. To find the probability that a continuous random variable lands in some set, we have to take the area under the density over that set, which differs from the discrete case, where we can read the probability value directly from the function.

But here's my confusion:

Let's say $Z = X + Y$, where $X$ is discrete and can take the values $-1$ and $1$, and $Y$ is a Gaussian random variable. Then $Z$ is a mixture of two Gaussians, with probability density function $p(Z) = p(Z|X=1)\,p(X=1) + p(Z|X=-1)\,p(X=-1)$.

However, when we try to find $p(X=1|Z)$, it equals $\dfrac{p(Z|X=1)\,p(X=1)}{p(Z)}$.
My question: how can we multiply $p(Z|X=1)$ and $p(X=1)$, since the former is a probability density function while the latter is a probability distribution function? What's more, $p(Z)$ is also a probability density function. In the end, what is $p(X=1|Z)$, a probability density or a probability distribution?

Best Answer

The real complication here is the joint probability distribution of $X$ and $Y$. In introductory courses, one learns about the joint pdf of two continuous random variables, or the joint pmf of two discrete random variables. The mixed case you mention, where one variable is continuous and the other is discrete, is not commonly treated in introductory texts, although some manage to sneak it in without calling attention to just what they are dealing with.

A common example is this: suppose $X$ has the binomial distribution $\text{bin}(n,u)$, where $u$ is the success probability. Each day a new $X$ is observed. However, the value of $u$ changes from day to day in a random way: it is selected each time uniformly at random from the interval $(0,1)$. Because the success probability $u$ is also random, we should denote this random quantity [before it is selected] by $U$. [In Bayesian contexts, this uniform distribution for $U$ is referred to as a 'prior pdf' for $U$. However, unlike the Bayesian context, which focuses on $U$, our focus here is on $X$.]

So we have here a mixed pair of random variables $(X,U)$, where one is discrete and one is continuous [as in the OP's question]. Here the [mixed] joint probability distribution of $(X,U)$ is specified by two distributions: the [marginal] pdf of $U$, namely $f_U(u) = 1$ for $0 < u < 1$ [and $0$ elsewhere], and the conditional distribution of $X$ given $U$, for which the conditional pmf is

$$f_{X|U}(x|u) = P(X=x \mid U=u) = P(\text{bin}(n,u) = x) = {n\choose x} u^x(1-u)^{n-x},\quad x = 0,1,\dots, n.$$

It turns out that the joint [mixed] pmf-pdf of $(X,U)$ can be written as the product of the marginal pdf of $U$ and the conditional pmf of $X|U$, analogous to what one does in the case where both variables are discrete, or both are continuous. [The justification for this involves what kahen alludes to in his answer, but it goes beyond the scope of this reply.]

So we can write the mixed joint pmf-pdf of $(X,U)$ as

$$ f_{X,U}(x,u) = f_U(u)f_{X|U}(x|u) = {n\choose x} u^x(1-u)^{n-x},\quad x = 0,1,\cdots, n,\quad 0<u<1.$$

[Here $f_{X,U}(x,u)$ is understood to vanish for other values of $(x,u)$.]
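To make the mixed object concrete, here is a minimal numerical sketch (assuming Python with SciPy available; the choice $n=5$ is mine and purely illustrative): summing the mixed pmf-pdf over $x$ and integrating it over $u$ should give $1$, just as a pure pmf sums to $1$ or a pure pdf integrates to $1$.

```python
from math import comb
from scipy.integrate import quad

n = 5  # illustrative choice; any n works

def joint_pmf_pdf(x, u):
    """Mixed joint pmf-pdf f_{X,U}(x,u) = f_U(u) * f_{X|U}(x|u)."""
    if not (0 < u < 1) or x not in range(n + 1):
        return 0.0  # vanishes outside the support
    return comb(n, x) * u**x * (1 - u)**(n - x)

# Sum over the discrete variable, integrate over the continuous one:
total = sum(quad(lambda u, x=x: joint_pmf_pdf(x, u), 0, 1)[0]
            for x in range(n + 1))
print(total)  # ~ 1.0
```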

This mixed pmf-pdf can be treated similarly to the way one manipulates a joint pmf or pdf [to get marginal and conditional distributions, for example], except that one integrates out $u$ to get the marginal pmf of $X$ and sums out $x$ to get the marginal pdf of $U$.

So, for example:

$$f_X(x) = \int_0^1 f_{X,U}(x,u)du =\int_0^1 {n\choose x} u^x(1-u)^{n-x}du = \frac{1}{n+1},\quad x = 0,1,\dots,n.$$

Thus $X$ is marginally uniformly distributed on the set of outcomes $\{0,1,\dots,n\}$. [This means that over a large number of days, all of the possible values of $X$ (from $0$ to $n$) will turn up about equally often. Given the way $U$ varies from day to day, this seems plausible.]
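The "equally often" claim is easy to check by simulation. A minimal sketch (assuming NumPy; the values $n=5$ and $10^6$ days are my illustrative choices, not from the discussion above):

```python
import numpy as np

rng = np.random.default_rng(0)
n, days = 5, 1_000_000  # illustrative values

u = rng.uniform(0.0, 1.0, size=days)   # a fresh U each day
x = rng.binomial(n, u)                 # X | U=u  ~  bin(n, u)

# Empirical frequencies of X should all be close to 1/(n+1):
print(np.bincount(x, minlength=n + 1) / days)  # each entry ~ 1/6
```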

[The evaluation of the integral is straightforward on noting that it is a beta function integral, so that
$$\int_0^1 u^x(1-u)^{n-x}\,du = B(x+1,\, n-x+1) = \frac{x!\,(n-x)!}{(n+1)!}.]$$
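For anyone who wants to sanity-check the identity numerically rather than via the beta function, a quick sketch (assuming SciPy; $n=5$, $x=2$ are again illustrative):

```python
from math import factorial
from scipy.integrate import quad

n, x = 5, 2  # illustrative values

numeric, _ = quad(lambda u: u**x * (1 - u)**(n - x), 0, 1)
exact = factorial(x) * factorial(n - x) / factorial(n + 1)
print(numeric, exact)  # both ~ 0.016667 = 2! * 3! / 6! = 1/60
```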

One can similarly get the conditional pdf of $U|X$ as

$$f_{U|X}(u|x) = \frac{f_{X,U}(x,u)}{f_X(x)} = (n+1){n\choose x}u^x(1-u)^{n-x},\quad 0<u<1.$$

This means that looking only at the days when $X=x$, conditionally, $U$ has a beta distribution with parameters $(x+1, n-x+1)$. This result is what interests Bayesians in their contexts, although their scenario would usually involve considering the value of $X$ only for day 1, and their interpretation of $f_{U|X}(u|x)$ is different from that given here.
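This conditional beta distribution can be checked with the same kind of simulation: keep only the days on which $X=x$ and look at the retained $U$ values. A sketch (assuming NumPy/SciPy; $n=5$ and $x=2$ are once more illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, x_obs, days = 5, 2, 1_000_000  # illustrative values

u = rng.uniform(0.0, 1.0, size=days)
x = rng.binomial(n, u)

u_given_x = u[x == x_obs]            # U values on days when X = 2
print(u_given_x.mean())              # ~ mean of Beta(3, 4) = 3/7 = 0.4286
print(stats.beta(x_obs + 1, n - x_obs + 1).mean())  # 0.4285...
```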

The bottom line is that in working with a mixed pmf-pdf, each variable retains its discrete or continuous nature whether one considers marginal, joint, or conditional distributions, and the joint mixed pmf-pdf can be manipulated as in the pure [= unmixed] cases, remembering to sum or integrate as appropriate.

This reply does not address the specific problem posed in the OP, but it can be handled by the same methods. [Only the inputs are a bit different: there one starts with [the joint pmf-pdf of] $(X,Y)$ and proceeds to $(X,Z)$. Starting from the mixed pmf-pdf of $(X,Y)$, it is straightforward to get the conditional pdf of $Z|X$ and then the joint mixed pmf-pdf of $(X,Z)$.]
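To sketch how this plays out in the OP's setup, suppose [as an assumption not stated in the OP] that $Y \sim N(0,\sigma^2)$ is independent of $X$ and $P(X=1) = P(X=-1) = \tfrac12$. Then Bayes' rule in its mixed form gives

$$P(X=1 \mid Z=z) = \frac{f_{Z|X}(z\mid 1)\,P(X=1)}{f_Z(z)} = \frac{\tfrac12\,\varphi_\sigma(z-1)}{\tfrac12\,\varphi_\sigma(z-1) + \tfrac12\,\varphi_\sigma(z+1)},$$

where $\varphi_\sigma$ denotes the $N(0,\sigma^2)$ density. The density "units" cancel between numerator and denominator, so $P(X=1 \mid Z=z)$ is an ordinary probability in $[0,1]$ [a value of the conditional pmf of $X$ given $Z$], which answers the OP's last question.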
