Looking at the links you provided, and what I've seen in the past, there seem to be many different names for what is essentially a particular function and its derivative.
First, let's cross off one of the names you mention in the title: cumulative density function. That term just doesn't make sense. A density function concerns itself with local properties of a phenomenon, while a cumulative function looks at global properties. A parallel term in calculus would perhaps be "integral derivative function", which makes no sense. So the term "cumulative density" is simply incorrect.
Now let's deal with the remaining terms: cumulative distribution function, distribution function, probability density function, and probability mass function.
The terms cumulative distribution function, probability density function, and
probability mass function have unique meanings, which I will try to explain below.
I can't remember seeing the term "distribution function" used as an equivalent of "probability density function" or "probability mass function", but that doesn't mean it is never used that way, considering how many different disciplines use these concepts. In measure-theoretic probability, however, the term "distribution function" always refers to the "cumulative distribution function", and the "cumulative" part is always dropped.
Next, let's define the terms and see how they are related. If you really want a truly complete answer, you'll need to know some measure theory, but I'm going to try to give a reasonable answer without using any measure theory. Nevertheless, a good understanding of calculus is indispensable.
If $X$ is a random variable, then its cumulative distribution function (CDF) is a function $F_X$ defined on real numbers as follows:
\begin{equation}
F_X(x) = P(X \leq x)
\end{equation}
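To make the definition concrete, here is a small sketch (my own example, not something from your links) of the CDF of a standard normal random variable, using the standard error-function identity $F_X(x) = \frac{1}{2}\bigl(1 + \operatorname{erf}(x/\sqrt{2})\bigr)$:

```python
from math import erf, sqrt

def normal_cdf(x):
    """CDF of a standard normal random variable X:
    F_X(x) = P(X <= x) = (1 + erf(x / sqrt(2))) / 2."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# By symmetry, P(X <= 0) = 0.5 for a standard normal.
print(normal_cdf(0.0))                  # 0.5
# F_X is increasing and tends to 0 and 1 in the tails.
print(normal_cdf(-5.0), normal_cdf(5.0))
```

Any CDF behaves this way: it rises from $0$ to $1$ and never decreases, as discussed next.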
It is not hard to see that $F_X$ is increasing: if $a < b$, then $F_X(a) \leq F_X(b)$. We can also show that $F_X$ is right-continuous at every $x$, meaning that if you approach $x$ through values larger than $x$, you'll get $F_X(x)$. In mathematical notation, this is written as
\begin{equation}
\lim_{z \to x^+} F_X(z) =F_X(x).
\end{equation}
But $F_X$ can have jump points: points where it is discontinuous. Since $F_X$ is continuous from the right everywhere, these discontinuities must come from the left. That means that if you approach such a point (say $a$) through values smaller than $a$, the value of $F_X$ at those points does not approach the value of $F_X(a)$.
As a simple example, consider the random variable $X$ that always takes the value $3$. Then it is easy to see that $F_X(x) = 0$ for $x < 3$ and $F_X(x) = 1$ for $3 \leq x$. If you approach $3$ via values larger than $3$, then $F_X(x) = 1$, and it stays at $1$ no matter how close to $3$ you get. However, if you approach $3$ through values less than $3$, then $F_X(x) = 0$ for those $x$, and regardless of how close to $3$ you get, you'll still have $F_X(x) = 0$. In mathematical notation, we write this as $F_X(3+) = F_X(3) = 1$ and $F_X(3-) = 0$.
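A minimal sketch of this step-function CDF, checking the one-sided limits at $3$ numerically:

```python
def F(x):
    """CDF of the random variable X that always equals 3:
    F(x) = 0 for x < 3 and F(x) = 1 for x >= 3."""
    return 0.0 if x < 3 else 1.0

# Right-continuity at 3: approaching from above, F stays at 1 ...
print(F(3.0), F(3.0 + 1e-9))   # 1.0 1.0
# ... but approaching from below, F stays at 0, so F(3-) = 0 != F(3) = 1.
print(F(3.0 - 1e-9))           # 0.0
```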
In general, if we denote by $F_X(a-)$ the value that $F_X(x)$ approaches as $x$ approaches $a$ through values smaller than $a$, then the jump points of $F_X$ are those points $a$ such that $F_X(a) - F_X(a-) \neq 0.$ But remember that $F_X$ is increasing, which implies that $F_X(a) - F_X(a-) > 0$ at jump points.
Another corollary of the increasing property of $F_X$ is that it can have at most countably many jump points. That means we can put them in a list (though the list may be infinite). Let's assume that $a_1, a_2, \ldots$ is the (possibly infinite) list of jump points of $F_X$. We will try to extract the part of $F_X$ that corresponds to the $a_i$'s.
Define a point mass at $a$ to be the following function:
\begin{equation}
\delta_a(x) =
\left\{
\begin{array}{ll}
0 & \mbox{if } x < a \\
1 & \mbox{if } x \geq a.
\end{array}
\right.
\end{equation}
Let $F_X(a_i) - F_X(a_i-) = b_i$. Define the function $F_X^d$ as follows.
\begin{equation}
F_X^d(x) = \sum_i b_i \delta_{a_i}(x).
\end{equation}
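The sum above is easy to code directly. As an illustration (the fair die is my choice of example), here is $F_X^d$ when the jumps are $b_i = 1/6$ at $a_i = 1, \ldots, 6$:

```python
def delta(a, x):
    """Point mass at a: 0 for x < a, 1 for x >= a."""
    return 0.0 if x < a else 1.0

def F_d(x, jumps):
    """F_X^d(x) = sum_i b_i * delta_{a_i}(x), with jumps = [(a_i, b_i), ...]."""
    return sum(b * delta(a, x) for a, b in jumps)

die = [(k, 1.0 / 6.0) for k in range(1, 7)]   # fair six-sided die
print(F_d(0.5, die))   # 0: below every jump point
print(F_d(3.5, die))   # ~0.5: three of the six jumps lie at or below 3.5
print(F_d(6.0, die))   # ~1.0: all of the probability has accumulated
```

Here $F_X = F_X^d$, so this die is a discrete random variable in the sense defined next.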
If $F_X(x) = F^d_X(x)$ for all $x$, then $X$ is called a discrete random variable, and the function
\begin{equation}
p_X(x) =
\left\{
\begin{array}{ll}
0 & \mbox{if } x \neq a_1,a_2,\ldots \\
b_i & \mbox{if } x = a_i \qquad i = 1,2,\ldots
\end{array}
\right.
\end{equation}
is called the probability mass function of $X$.
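Since $b_i = F_X(a_i) - F_X(a_i-)$, the probability mass function can be read off from the jumps of the CDF. A small numerical sketch (again with a fair die as my example), estimating $F_X(a-)$ with a tiny left offset and exact rational arithmetic:

```python
from fractions import Fraction

def F(x):
    """CDF of a fair six-sided die: jumps of size 1/6 at 1, 2, ..., 6."""
    return sum(Fraction(1, 6) for a in range(1, 7) if x >= a)

def pmf(a, eps=Fraction(1, 10**9)):
    """p_X(a) = F_X(a) - F_X(a-), with F_X(a-) estimated just left of a."""
    return F(a) - F(a - eps)

print(pmf(3))    # 1/6: a jump point of F, so the pmf is positive there
print(pmf(3.5))  # 0: no jump, so the pmf vanishes there
```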
If $F_X$ is a differentiable function, with $F_X' = f$, then $f$ is called the probability density function of $X$. It is easy to see from the fundamental theorem of calculus that for any $a$ and $b$,
\begin{equation}
\int_a^b f(x)dx = F_X(b) - F_X(a) = P(a < X \leq b).
\end{equation}
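A quick numerical check of this identity (my sketch, using the standard normal as the example and a plain trapezoidal rule; `integrate` is my helper, not a library function):

```python
from math import erf, exp, pi, sqrt

def f(x):
    """pdf of a standard normal: f(x) = exp(-x^2 / 2) / sqrt(2 * pi)."""
    return exp(-x * x / 2.0) / sqrt(2.0 * pi)

def F(x):
    """CDF of a standard normal, via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def integrate(f, a, b, n=10_000):
    """Plain trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return s * h

a, b = -1.0, 2.0
# The two quantities below agree: the integral of f over (a, b]
# equals F(b) - F(a) = P(a < X <= b).
print(integrate(f, a, b))
print(F(b) - F(a))
```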
Notice that there is a vast sea between the case when a probability mass function is defined ($F_X$ is a sum of point masses) and the case when a probability density function is defined ($F_X$ is differentiable). To understand what lies in between, you need to study probability from a measure-theoretic perspective.
Best Answer
In probability theory, there is nothing called the cumulative density function as you name it. There is a very important concept called the cumulative distribution function (or cumulative probability distribution function) which has the initialism CDF (in contrast to the initialism pdf for the probability density function). The definition of the CDF $F_X(u)$ of a random variable $X$ is that the value of this function at the argument $u$ (here $u$ can be any real number) is the probability of the event $(X \leq u)$, the probability that the random variable $X$ is no larger than the real number $u$. Using symbols instead of words, we have that $$F_X(u) = P(X \leq u), \quad -\infty < u < \infty.\tag{1}$$
Every random variable (no matter what kind) has a CDF, but pdfs generally are defined only for random variables that are called continuous random variables. So what are continuous random variables? These are random variables that can take on every possible real-number value in a continuum, which for our purposes can be taken to be the entire real line or an interval $(a,b)$ or $(a,b]$, etc., of the real line, and whose CDF $F_X(u)$ is a continuous function of $u$ for all values of $u$ and, furthermore, is differentiable at every $u$ except possibly for a finite number of points. Keep in mind that the CDF is continuous even at these oddball points; it is just that the CDF is not differentiable there. In fact, the reason why the CDF of a continuous random variable can be continuous but non-differentiable at a point $u_1$ is that the "derivative on the right" does not equal the "derivative on the left".
What's the point of having a differentiable function if one doesn't differentiate it? The pdf $f_X(u)$ of a continuous random variable $X$ is defined to be the derivative of the CDF $F_X(u)$ at every point at which the CDF is differentiable, and at those few points where the derivative does not exist, one can define the value of the pdf to be any number one likes; the choice won't matter in the least. But it is prudent to choose a nonnegative number so that one can fearlessly claim that $f_X(u) \geq 0$ for all real numbers $u$.
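A small numerical sanity check of this definition (my sketch, with the standard normal as the example): a centered difference quotient of the CDF recovers the pdf at points of differentiability.

```python
from math import erf, exp, pi, sqrt

def F(u):
    """CDF of a standard normal random variable."""
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))

def f(u):
    """pdf of a standard normal: the derivative of F."""
    return exp(-u * u / 2.0) / sqrt(2.0 * pi)

# A centered difference quotient of the CDF approximates its derivative,
# which is exactly the pdf wherever the CDF is differentiable.
h = 1e-5
for u in (-1.0, 0.0, 2.0):
    approx = (F(u + h) - F(u - h)) / (2.0 * h)
    print(round(approx, 6), round(f(u), 6))   # the two columns agree
```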
Once again, you seem to have some badly mangled notions. Let's start with the probability that a random variable $X$ takes on values in the interval $(a,b]$. The event $(X\leq b)$ is the disjoint union of the events $(X\leq a)$ and $(a < X \leq b)$, and so we have that \begin{align} P(X \leq b) &= P(X \leq a) + P(a < X \leq b)\\ F_X(b) &= F_X(a) + P(a < X \leq b) & {\scriptstyle{\text{on using }} (1)} \end{align} and so we can conclude that $$P(a < X \leq b) = F_X(b) - F_X(a). \tag{2}$$ Equations $(1)$ and $(2)$ hold for all random variables. But consider $(2)$ for the case of a continuous random variable, and let's ask what happens in the limit as $a$ approaches $b$ from below. Well, as $a$ gets closer and closer to $b$, every real number that is strictly smaller than $b$ gets eliminated as $a$ moves past it, but $b$ itself never gets eliminated. The limit of the event is the set containing the single number $b$ all by itself, and so $$P(X=b) = F_X(b) - \lim_{a\uparrow b}F_X(a) = F_X(b) - F_X(b) = 0.$$ Remember that $F_X(u)$ is a continuous function, which makes that limit evaluation work as stated. We conclude that $P(X = b) = 0$ for every real number $b$ whenever $X$ is a continuous random variable.
So, where did all the probability disappear to? Well, for a continuous random variable, probability resides in the intervals of the real line. As a specific case, consider a continuous random variable with CDF $$F_X(u) = \begin{cases}0, & u < 0,\\u, & 0 \leq u < 1,\\1, & u \geq 1.\end{cases}$$ Notice that $F_X(u)$ is continuous at $u=0$ and $u=1$ but is not differentiable at those points. Thus, we readily get that $f_X(u)$ has value $1$ for $0 < u < 1$ and value $0$ for $u < 0$ and $u > 1$. Since the derivative of $F_X(u)$ is undefined at $u=0$ and $u=1$, we set $f_X(0) = f_X(1) = 1$. Now, for $0 < a < b < 1$, we have that $$P(a < X \leq b) = F_X(b)-F_X(a) = b-a,$$ or in words, the probability that $X$ lies in a subinterval of $(0,1)$ equals the length of that subinterval.
This observation gives some intuition for the assertion that $P(X=b)$ has value $0$: the single point $b$ constitutes an interval of zero length and hence has probability $0$. More generally, the Fundamental Theorem of Calculus tells us that $$P(a < X \leq b) = F_X(b)-F_X(a) = \int_a^b f_X(u) \,\mathrm du$$ which gives us $b-a$ for the pdf introduced as an example but requires more strenuous evaluation of integrals in the general case.
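The integral above can be checked numerically for the uniform example (a minimal sketch with a midpoint rule; `integrate` is my helper, not a library function):

```python
def f(u):
    """pdf of a uniform random variable on (0, 1): 1 there, 0 elsewhere."""
    return 1.0 if 0.0 < u < 1.0 else 0.0

def integrate(f, a, b, n=100_000):
    """Midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

a, b = 0.2, 0.7
# The integral of the pdf over (a, b] equals b - a = 0.5,
# in agreement with P(a < X <= b) = F_X(b) - F_X(a).
print(integrate(f, a, b))
```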
Finally, I want to make the point that asking for the probability that a continuous random variable has value exactly $b$ is a pretty meaningless question, in that in practice one can never tell whether the event $(X=b)$ has occurred or not. How can we determine whether the observed value of $X$ equals $\pi$ or not? All we have is the observed value of $X$ to some degree of precision, and that number is never going to be equal to $\pi$. Even if one imagines that $X$ is given with infinite precision, it would require an infinite amount of time to compare the infinitely many digits of $X$ with $3.14159265\ldots$, and we cannot stop and say that the digits match thus far and so we will assume that the match extends all the way! What is of interest is the probability that $X$ is approximately $b$: the probability that $X$ lies in a short interval containing $b$ (possibly centered on $b$), or that $X$ is no larger than $b$, or that $X$ is larger than $b$. All these probabilities can be readily computed from the CDF, or with a little effort by integrating the pdf over the intervals of interest.