The distribution is simply the assignment of probabilities to sets of possible values of the random variable. If I tell you how probable it is that a certain random variable is between $3$ and $5$, and also how probable it is that it lies in every other possible set, then I've told you the distribution. Since there are infinitely many sets, I can't do this for each one individually, so perhaps a more down-to-earth way to say it is this: suppose $X$ and $Y$ are random variables. If, for every set, the probability that $X$ is in that set equals the probability that $Y$ is in that same set, then $X$ and $Y$ have the same distribution.
A probability density function is a way of characterizing some distributions. For example, consider the function
$$
f(x) = \begin{cases} 0 & \text{if }x<0, \\ e^{-x} & \text{if }x\ge 0. \end{cases}
$$
To say that this is the probability density function of a random variable $X$ is to say that for every measurable set $A$ of real numbers,
$$
\Pr(X\in A) = \int_A f(x)\,dx.
$$
The probability assigned to each set $A$ is given by the integral above. A more concrete example:
$$
\Pr(3<X<5) = \int_3^5 e^{-x}\,dx\text{ and }\Pr(X\ge 2) = \int_2^\infty e^{-x}\,dx.
$$
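As a quick numerical sanity check, the two probabilities above can be approximated by integrating the density directly. This is just an illustrative sketch (the function names and the midpoint-rule integrator are my own, not from the text); the exact values are $e^{-3}-e^{-5}$ and $e^{-2}$.

```python
import math

# The density from the text: f(x) = 0 for x < 0, e^{-x} for x >= 0.
def f(x):
    return math.exp(-x) if x >= 0 else 0.0

# Simple midpoint-rule approximation of the integral of f over [a, b].
def integrate(a, b, n=100_000):
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

p_3_5 = integrate(3, 5)    # Pr(3 < X < 5); exact value is e^{-3} - e^{-5}
p_ge_2 = integrate(2, 50)  # Pr(X >= 2); the tail beyond 50 is negligible

print(p_3_5)   # approximately e^{-3} - e^{-5} ≈ 0.0430
print(p_ge_2)  # approximately e^{-2} ≈ 0.1353
```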
Not every probability distribution has a density. Let $X$ be the number of aces (ones) when a die is thrown four times, so that $X\in\{0,1,2,3,4\}$. The distribution of $X$ assigns a positive number to every set that intersects $\{0,1,2,3,4\}$. For example, the set $\{x : x\ge 3.2\}$ intersects $\{0,1,2,3,4\}$ (at the point $4$), so the distribution of $X$ assigns it the positive number $\Pr(X=4)$. But there is no function $f$ such that $\int_A f(x)\,dx$ equals $\Pr(X\in A)$ for every set $A$: any such integral over the single point $\{4\}$ would be $0$, while $\Pr(X=4)>0$.
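Concretely, the $X$ in this example is binomial: $X\sim\mathrm{Binomial}(4,\tfrac16)$ (this identification is my reading of "number of aces in four throws"). A small sketch of the point being made:

```python
from math import comb

# X = number of aces (ones) in four die throws, so X ~ Binomial(4, 1/6).
def pmf(k, n=4, p=1/6):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Pr(X >= 3.2) = Pr(X = 4): positive, even though {x : x >= 3.2}
# meets {0,1,2,3,4} only in the single point 4.
p_tail = pmf(4)
print(p_tail)  # (1/6)^4 = 1/1296

# No density can exist: a density f would give every single point
# probability  ∫_{{4}} f(x) dx = 0,  contradicting Pr(X = 4) > 0.
```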
PS prompted by comments below: To put it in a different kind of language: Say $m$ is a measure (not necessarily assigning finite measure to the whole space) on the set of all measurable subsets of a space $S$. A probability density with respect to the measure $m$ is a measurable function $f:S\to[0,\infty)$ such that the function
$$
A\mapsto \int_A f\,dm
$$
is a probability measure on the set of measurable subsets of $S$.
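One helpful special case (my own illustration, not from the text): if $m$ is counting measure on a finite set $S$, then $\int_A f\,dm$ is just the sum $\sum_{k\in A} f(k)$, so a "density with respect to counting measure" is an ordinary probability mass function. Reusing the die example:

```python
from math import comb

# S = {0,...,4}, m = counting measure on S.
S = set(range(5))

# A density f w.r.t. counting measure is just a pmf; here Binomial(4, 1/6).
def f(k):
    return comb(4, k) * (1/6)**k * (5/6)**(4 - k)

# The induced measure  A |-> ∫_A f dm  is a plain sum over A ∩ S.
def mu(A):
    return sum(f(k) for k in A if k in S)

print(mu(S))    # total mass 1, so mu is a probability measure
print(mu({4}))  # (1/6)^4: single points can carry positive mass here
```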
A probability distribution on $S$ is simply a probability measure on the set of all measurable subsets of $S$. But not quite "simply": The probability distribution of a random variable $X:\Omega\to S$ is the probability measure on measurable subsets of $S$ that assigns measure $P(\{\omega\in\Omega : X(\omega)\in A\})$ to each measurable subset $A$ of $S$.
PPS: When $f\ge0$ is a measurable function on Borel or Lebesgue-measurable subsets of $\mathbb R$, one sometimes refers to the "measure" $f(x)\,dx$, meaning the measure
$$
A\mapsto \int_A f(x)\,dx.
$$
If in addition $\displaystyle\int_{\mathbb R} f(x)\,dx=1$, so that $f$ is a probability density, then one may similarly refer to the "probability distribution" $f(x)\,dx$.
(Of course, not all probability distributions on Borel subsets of the real line are of this form.)
A (real valued) random variable is just a measurable map $X : \Omega \to \Bbb{R}$, where $(\Omega, \mathcal{F}, \Bbb{P})$ is an arbitrary probability space.
What we can then do is consider the push-forward measure $\Bbb{P}_X = X_\ast \Bbb{P}$ of $\Bbb{P}$ by $X$. This is sometimes called the distribution of $X$. By definition, we have
$$
X_\ast \Bbb{P} (E) = \Bbb{P}(X^{-1}(E)) = \Bbb{P}(X \in E),
$$
for any (measurable) $E \subset \Bbb{R}$, so that (check this) $\Bbb{P}_X$ is a probability measure on $\Bbb{R}$. Note that the last expression is the one that most mathematicians in probability theory would use.
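The push-forward is easy to see on a finite probability space. Here is a minimal sketch (the space, the map, and all names are illustrative choices of mine): $\Omega$ is one throw of a fair die and $X$ is the indicator of throwing an ace.

```python
from fractions import Fraction

# A finite probability space: Omega = outcomes of one die throw, P uniform.
Omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in Omega}

# A random variable X : Omega -> R, the indicator of "ace".
def X(w):
    return 1 if w == 1 else 0

# Push-forward measure: P_X(E) = P({w : X(w) in E}) = P(X in E).
def P_X(E):
    return sum(P[w] for w in Omega if X(w) in E)

print(P_X({1}))     # 1/6
print(P_X({0, 1}))  # 1 -- so P_X is itself a probability measure
```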
Now - as you already stated yourself - we can associate to every (locally finite) measure $\mu$ on $\Bbb{R}$ the distribution function $F = F_\mu$ of $\mu$, given by
$$
F_\mu (x) = \mu((-\infty, x]).
$$
In this way, we can also associate to the measure $\Bbb{P}_X$ the distribution function $F_X = F_{\Bbb{P}_X}$ which satisfies
$$
F_X (a) = \Bbb{P}_X ((-\infty, a]) = \Bbb{P}(X \in (-\infty, a]) = \Bbb{P}(X \leq a).
$$
Sometimes, this is also called the distribution of $X$ (note that we now call both the measure $\Bbb{P}_X$ and its distribution function $F_X = F_{\Bbb{P}_X}$ the "distribution of $X$". But since each of these two objects uniquely determines the other, this is not much of a problem).
Finally, none of this has much to do with the properties of $X$ as a function (i.e. with properties like continuity of $X$, ...). To see this, note that $\Omega$ is an arbitrary probability space; hence, it does not make sense in general to talk about continuity of $X$, for example.
There is a different notion of a continuous random variable. Here, we call $X$ a continuous random variable if the distribution function $F_X$ is continuous. This is equivalent to the condition $\Bbb{P}(X = a) = 0$ for all $a$ (because $\Bbb{P}(X = a)$ is exactly the jump of $F_X$ at $a$, i.e. $F_X(a) - \lim_{x \uparrow a} F_X(x)$), and thus has nothing to do with continuity of $X$ as a function (which, as noted above, does not even make sense in general).
Short summary:
1) Each real-valued random variable comes with its own cumulative distribution function. If we place additional assumptions on $X$, it might be the case that this distribution function is the one associated with Lebesgue measure. Note that we then have to restrict Lebesgue measure to (e.g.) an interval of length $1$, because otherwise it is not a probability measure.
2) As explained above, the associated CDF is given by
$$
F_X (a) = \Bbb{P}(X \leq a).
$$
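The relation $F_X(a) = \Bbb{P}(X \leq a)$ can be checked by simulation. As a sketch (all names here are illustrative; I use the standard exponential, whose density $e^{-x}$ appeared earlier and whose CDF is $1 - e^{-a}$ for $a \ge 0$):

```python
import math
import random

# Estimate F_X(a) = P(X <= a) empirically for X ~ Exp(1).
random.seed(0)
samples = [random.expovariate(1.0) for _ in range(200_000)]

def F_empirical(a):
    return sum(x <= a for x in samples) / len(samples)

for a in (1.0, 2.0):
    exact = 1 - math.exp(-a)  # closed-form CDF of Exp(1) for a >= 0
    print(a, F_empirical(a), exact)
```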
Best Answer
The distribution, or the law, of a random variable $X$ is a probability measure $\mathcal L:\mathcal R\to[0,1]$ on $(\mathbb R,\mathcal R)$, where $\mathcal R$ is the Borel $\sigma$-algebra on $\mathbb R$. The cumulative distribution function (CDF) of a random variable $X$ is the function $F_X:\mathbb R\to[0,1]$ given by $F_X(x)=\Pr\{X\le x\}$ for $x\in\mathbb R$. If we know the distribution of the random variable $X$, then we also know its CDF. It is also true that the CDF uniquely determines the distribution of the random variable $X$ (see this question).
I hope this helps.