Probability density functions and cumulative distribution functions: open or closed intervals

probability, probability distributions, probability theory, statistics

In statistics, the probability density function, $f_X(x)$, and the cumulative distribution function, $F_X(x)$, of a real-valued random variable $X$ are said to have the following meanings:

$$f_X(x)=\Pr[X=x] \space\space;\space\space F_X(x)=\Pr[X\leq x]$$

And they are related by

$$f_{X}(x)=\frac{d}{d x} F_{X}(x) \space\space;\space\space F_X(x)=\int_{-\infty}^{x} f_{X}(u) du$$

However, I have found in various sources different criteria for including or not including the values of the endpoints of the interval. When do we consider a closed interval and an open interval? Which of the following options would be the correct one?

  1. $F_X(x)$ equals $\Pr[X < x]$ or $\Pr[X\leq x]$?

  2. $F_X(b)-F_X(a)$ equals $\Pr[a<X< b]$, $\Pr[a\leq X< b]$, $\Pr[a< X \leq b]$ or $\Pr[a \leq X \leq b]$?

  3. $f(x)dx$ equals $\Pr\big[X\in(x,x+dx)\big]$, $\Pr\big[X\in (x,x+dx]\big]$, $\Pr\big[X\in[x,x+dx)\big]$ or $\Pr\big[X\in[x,x+dx]\big]$?

Best Answer

In short: you have said that $f$ is a density, i.e. it relates to a continuous distribution, so it won't matter whether you include the endpoints or not.

In more detail:

First, if your distribution is continuous it doesn't matter, since $X$ takes any single given value with probability $0$.
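
To spell out why a single point carries no probability, here is a short derivation. It assumes only that $X$ has a density $f_X$, as in the question:

$$\Pr[X=x]=\int_{x}^{x} f_{X}(u)\,du=0 \quad\Longrightarrow\quad \Pr[a\leq X\leq b]=\Pr[a<X<b]=\int_{a}^{b} f_{X}(u)\,du$$

In particular, the value $f_X(x)$ is a density, not the probability $\Pr[X=x]$.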

However, it is standard to include the right endpoint (and this matters if your distribution is discrete or mixed), i.e. $F_X(x)=P(X\leq x)$, which answers your first question. In fact, this also answers your second question, since it implies $F_X(b)-F_X(a)=P(a<X\leq b)$.
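
A small numerical check can make the discrete case concrete. The sketch below uses a made-up distribution on $\{1,2,3,4\}$ (the probabilities and the helper names `cdf` and `prob` are illustrative assumptions, not from the question) and compares $F_X(b)-F_X(a)$ with the four candidate interval probabilities:

```python
from fractions import Fraction

# Hypothetical discrete distribution, chosen only for illustration.
pmf = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 8), 4: Fraction(1, 8)}

def cdf(x):
    """F_X(x) = P(X <= x), using the standard right-closed convention."""
    return sum((p for k, p in pmf.items() if k <= x), Fraction(0))

def prob(condition):
    """Probability that X satisfies the given condition."""
    return sum((p for k, p in pmf.items() if condition(k)), Fraction(0))

a, b = 1, 3
cdf_difference = cdf(b) - cdf(a)

# The four candidate readings of F_X(b) - F_X(a).
candidates = {
    "open": prob(lambda k: a < k < b),            # P(a <  X <  b)
    "left_closed": prob(lambda k: a <= k < b),    # P(a <= X <  b)
    "right_closed": prob(lambda k: a < k <= b),   # P(a <  X <= b)
    "closed": prob(lambda k: a <= k <= b),        # P(a <= X <= b)
}

for name, value in candidates.items():
    print(name, value, value == cdf_difference)
```

With this particular distribution the four probabilities are all different, and only the right-closed one, $P(a<X\leq b)$, matches the difference of the CDF values.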

For your third question, since $f$ is a density, it does not matter. This is similar to taking the Riemann integral of a function over $[a,b]$ or over $(a,b)$: the value is the same.
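
One way to make the informal statement about $f(x)\,dx$ precise is to replace $dx$ by a small increment $h>0$. This is a sketch that assumes, in addition, that $f_X$ is continuous at $x$:

$$\Pr[x<X\leq x+h]=F_X(x+h)-F_X(x)=\int_{x}^{x+h} f_{X}(u)\,du=f_X(x)\,h+o(h)$$

Because each endpoint has probability $0$, the same expression holds for $(x,x+h)$, $[x,x+h)$ and $[x,x+h]$.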