Solved – The relationship between cumulative distribution, cumulative density, and probability density


Can you please explain those three terms and the relationship between them (both graphical and mathematical way would be fine)?

EDIT:
Those terms are mainly associated with functions. The quotation from this article says:

The probability density function, pdf as f(x). (Note: This function is
also known as the probability distribution function and the
probability mass function, but will be referred to henceforth as the
probability density function.)

According to above:

Probability density function = Probability distribution function = Probability mass function

But Wikipedia has separate articles on the Probability Mass Function and the Probability Density Function. And here it is mentioned that what the name Probability Distribution Function refers to depends on the context in which it is used. So where is the truth?

For completeness: the term "cumulative density" is incorrect, as discussed below.

Best Answer

Looking at the links you provided, and at what I've seen in the past, there seem to be a lot of different names for what is essentially one particular function and its derivative.

First, let's cross off one of the names you mention in the title: cumulative density function. That term just doesn't make sense. A density function concerns itself with local properties of a phenomenon, while a cumulative function looks at global properties. A parallel term in calculus would perhaps be "integral derivative function", which makes no sense. So the term "cumulative density" is simply incorrect.

Now let's deal with the remaining terms: cumulative distribution function, distribution function, probability density function, and probability mass function.

The terms cumulative distribution function, probability density function, and probability mass function have unique meanings, which I will try to explain below.

I can't remember seeing the term "distribution function" used as an equivalent of "probability density function" or "probability mass function", but that doesn't mean it isn't used, considering how many different disciplines work with these concepts. In measure-theoretic probability, however, the term "distribution function" always refers to the "cumulative distribution function", and the "cumulative" part is always dropped.

Next, let's define the terms and see what their relationship is. If you really want a truly complete answer, you'll need to know some measure theory, but I'm going to try to give a reasonable answer without using any measure theory. Nevertheless, a good understanding of calculus is unavoidable.

If $X$ is a random variable, then its cumulative distribution function (CDF) is a function $F_X$ defined on real numbers as follows: \begin{equation} F_X(x) = P(X \leq x) \end{equation}
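As an illustration (my addition, not part of the original answer), here is a minimal Python sketch, assuming NumPy and SciPy are available, that checks the defining property $F_X(x) = P(X \leq x)$ for a standard normal random variable by comparing the library's CDF with an empirical estimate from simulated samples.

```python
import numpy as np
from scipy import stats

# Illustration: estimate P(X <= x) empirically for X ~ N(0, 1)
# and compare it with the library's CDF F_X(x).
rng = np.random.default_rng(0)
samples = rng.standard_normal(100_000)

for x in (-1.0, 0.0, 1.5):
    empirical = np.mean(samples <= x)   # fraction of samples with X <= x
    exact = stats.norm.cdf(x)           # F_X(x) for the standard normal
    print(f"x={x:+.1f}  empirical={empirical:.4f}  cdf={exact:.4f}")
```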

It is not hard to see that $F_X$ is increasing: if $a < b$, then $F_X(a) \leq F_X(b)$. We can also show that $F_X$ is right-continuous at every $x$, meaning that if you approach $x$ through values larger than $x$, you'll get $F_X(x)$. In mathematical notation, this is written as \begin{equation} \lim_{z \to x^+} F_X(z) = F_X(x). \end{equation}

But $F_X$ can have jump points: points where it is discontinuous. Since $F_X$ is continuous from the right everywhere, these discontinuities must be from the left. What that means is that if you approach such a point (say $a$) through values smaller than $a$, then the values of $F_X$ at those points do not approach $F_X(a)$.

As a simple example, consider the random variable $X$ that always takes the value $3$. Then it is easy to see that $F_X(x) = 0$ for $x < 3$ and $F_X(x) = 1$ for $x \geq 3$. If you approach $3$ through values larger than $3$, then $F_X(x) = 1$, and it stays at $1$ no matter how close to $3$ you get. However, if you approach $3$ through values less than $3$, then $F_X(x) = 0$ for those $x$, and regardless of how close to $3$ you get, you'll still have $F_X(x) = 0$. In mathematical notation, we write this as $F_X(3+) = F_X(3) = 1$ and $F_X(3-) = 0$.
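Here is a tiny sketch of that step function (again my own addition): it simply evaluates $F_X$ at points just above and just below $3$ to show the jump and the one-sided limits numerically.

```python
# CDF of the random variable that always equals 3:
# F_X(x) = 0 for x < 3 and F_X(x) = 1 for x >= 3.
def F_X(x: float) -> float:
    return 1.0 if x >= 3 else 0.0

# Approaching 3 from the right keeps F_X at 1; from the left it stays at 0.
for x in (3.1, 3.01, 3.001, 3.0, 2.999, 2.99, 2.9):
    print(f"F_X({x}) = {F_X(x)}")
```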

In general, if we indicate by $F_X(a-)$ the value that $F_X(x)$ approaches to as $x$ approaches $a$ through values smaller than $a$, then the jump points of $F_X$ are those points $a$ such that $F_X(a) - F_X(a-) \neq 0.$ But remember that $F_X$ is increasing, and that implies that $F_X(a) - F_X(a-) > 0$ at jump points.

Another corollary of the increasing property of $F_X$ is that it can have at most countably many jump points. That means we can put them in a list (though the list may be infinite). Let's assume that $a_1, a_2, \ldots$ is the (possibly infinite) list of jump points of $F_X$. We will now try to extract the part of $F_X$ that corresponds to the $a_i$'s.

Define a point mass at $a$ to be the following function: \begin{equation} \delta_a(x) = \left\{ \begin{array}{ll} 0 & \mbox{if } x < a \\ 1 & \mbox{if } x \geq a. \end{array} \right. \end{equation}

Let $F_X(a_i) - F_X(a_i-) = b_i$. Define the function $F_X^d$ as follows. \begin{equation} F_X^d(x) = \sum_i b_i \delta_{a_i}(x). \end{equation}
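To make the decomposition concrete, here is a short sketch (my own illustration, using made-up jump points $a_i$ and jump sizes $b_i$) that builds $F_X^d$ as a sum of weighted point masses.

```python
# Illustrative (made-up) jump points a_i and jump sizes b_i with sum(b_i) <= 1.
jumps = [(1.0, 0.2), (2.5, 0.5), (4.0, 0.3)]   # pairs (a_i, b_i)

def delta(a: float, x: float) -> float:
    """Point mass at a: 0 for x < a, 1 for x >= a."""
    return 1.0 if x >= a else 0.0

def F_d(x: float) -> float:
    """Discrete part F_X^d(x) = sum_i b_i * delta_{a_i}(x)."""
    return sum(b * delta(a, x) for a, b in jumps)

for x in (0.0, 1.0, 2.0, 2.5, 3.0, 4.0, 5.0):
    print(f"F_X^d({x}) = {F_d(x):.2f}")
```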

If $F_X(x) = F^d_X(x)$ for all $x$, then $X$ is called a discrete random variable, and the function
\begin{equation} p_X(x) = \left\{ \begin{array}{ll} 0 & \mbox{if } x \neq a_1,a_2,\ldots \\ b_i & \mbox{if } x = a_i \qquad i = 1,2,\ldots \end{array} \right. \end{equation} is called the probability mass function of $X$.
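As a concrete (added) example: a Poisson random variable is discrete, its CDF is a sum of point masses at $0, 1, 2, \ldots$, and the jump sizes $b_i$ are exactly the PMF values. The sketch below, assuming SciPy, checks that summing the PMF up to $k$ reproduces the CDF at $k$.

```python
from scipy import stats

# For a discrete random variable (here Poisson with mean 2), the CDF is the
# running sum of the probability mass function: F_X(k) = sum_{j <= k} p_X(j).
mu = 2.0
for k in range(6):
    pmf_sum = sum(stats.poisson.pmf(j, mu) for j in range(k + 1))
    cdf = stats.poisson.cdf(k, mu)
    print(f"k={k}  sum of pmf={pmf_sum:.4f}  cdf={cdf:.4f}")
```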

If $F_X$ is a differentiable function, with $F_X' = f$, then $f$ is called the probability density function of $X$. It is easy to see from the fundamental theorem of calculus that for any $a$ and $b$, \begin{equation} \int_a^b f(x)\,dx = F_X(b) - F_X(a) = P(a < X \leq b). \end{equation}
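A short numerical check of this identity (my addition, using SciPy's standard normal as the example): integrating the density $f$ over $[a, b]$ should match $F_X(b) - F_X(a) = P(a < X \leq b)$.

```python
from scipy import stats
from scipy.integrate import quad

# Check int_a^b f(x) dx = F_X(b) - F_X(a) for X ~ N(0, 1).
a, b = -1.0, 2.0
integral, _ = quad(stats.norm.pdf, a, b)          # numerical integral of the pdf
cdf_diff = stats.norm.cdf(b) - stats.norm.cdf(a)  # F_X(b) - F_X(a)
print(f"integral = {integral:.6f}, F_X(b) - F_X(a) = {cdf_diff:.6f}")
```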

Notice that there is a vast sea between the case when a probability mass function is defined ($F_X$ is a sum of point masses) and the case when a probability density function is defined ($F_X$ is differentiable). To understand what lies in between, you need to study probability from a measure-theoretic perspective.
