Random Variables – Differences Between Absolutely Continuous and Continuous Random Variables

In the book Limit Theorems of Probability Theory by Valentin V. Petrov, I saw a distinction between the definitions of a distribution being "continuous" and "absolutely continuous", which is stated as follows:

  1. "…The distribution of the random variable $X$ is said to be continuous if $P\left(X \in B\right)=0$ for any finite or countable set $B$ of points of the real line. It is said to be absolutely continuous if $P\left(X \in B\right)=0$ for all Borel sets $B$ of Lebesgue measure zero…" $(*)$

The concept I am familiar with is:

  2. "If a random variable has a continuous Cumulative Distribution Function, then it is absolutely continuous."

My questions are: are the two descriptions of "absolute continuity" in (1) and (2) talking about the same thing? If so, how can I translate one explanation into the other?

Best Answer

The descriptions differ: only the first one, $(*)$, is correct. This answer explains how and why.


Continuous distributions

A "continuous" distribution $F$ is continuous in the usual sense of a continuous function. One definition (usually the first one people encounter in their education) is that for each $x$ and for any number $\epsilon\gt 0$ there exists a $\delta$ (depending on $x$ and $\epsilon$) for which the values of $F$ on the $\delta$-neighborhood of $x$ vary by no more than $\epsilon$ from $F(x)$.

It is a short step from this to showing that when a continuous $F$ is the distribution of a random variable $X$, then $\Pr(X=x)=0$ for every number $x$. After all, the continuity definition implies you can shrink $\delta$ to make $\Pr(X\in (x-\delta, x+\delta))$ smaller than any $\epsilon \gt 0$; since (1) this probability is no less than $\Pr(X=x)$ and (2) $\epsilon$ can be arbitrarily small, it follows that $\Pr(X=x)=0$. The countable additivity of probability extends this result to any finite or countable set $B$.
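The shrinking-neighborhood argument can be sketched numerically. The snippet below is a minimal illustration (my choice of example, not part of the original argument) using the standard normal CDF, expressed through `math.erf`: the probability $\Pr(X\in(x-\delta, x+\delta)) = F(x+\delta)-F(x-\delta)$ drops below any prescribed $\epsilon$ as $\delta$ shrinks.

```python
from math import erf, sqrt

def F(x):
    """CDF of the standard normal distribution -- a continuous F."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

x = 0.3
for delta in [1.0, 0.1, 0.01, 0.001]:
    # Pr(X in (x - delta, x + delta)) for this continuous distribution.
    p = F(x + delta) - F(x - delta)
    print(f"delta = {delta:>6}:  Pr = {p:.6f}")

# The neighborhood probabilities shrink toward 0.  Since
# Pr(X = x) <= p for every delta, this forces Pr(X = x) = 0.
```

Any other continuous CDF would serve equally well here; the normal is used only because its CDF has a convenient closed form.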

Absolutely continuous distributions

All distribution functions $F$ define positive, finite measures $\mu_F$ determined by

$$\mu_F((a,b]) = F(b) - F(a).$$

Absolute continuity is a concept of measure theory. One measure $\mu_F$ is absolutely continuous with respect to another measure $\lambda$ (both defined on the same sigma algebra) when, for every measurable set $E$, $\lambda(E)=0$ implies $\mu_F(E)=0$. In other words, relative to $\lambda$, there are no "small" (measure zero) sets to which $\mu_F$ assigns "large" (nonzero) probability.

We will be taking $\lambda$ to be the usual Lebesgue measure, for which $\lambda((a,b]) = b-a$ is the length of an interval. The second half of $(*)$ states that the probability measure $\mu_F(B)=\Pr(X\in B)$ is absolutely continuous with respect to Lebesgue measure.
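A sketch of a measure that fails absolute continuity may help fix the idea. The example below (my own illustration, a point mass at $0$ represented by its step CDF) uses the rule $\mu_F((a,b]) = F(b)-F(a)$: the intervals $(-\epsilon, 0]$ shrink to the singleton $\{0\}$, their Lebesgue measure goes to $0$, yet $\mu_F$ keeps assigning them probability $1$.

```python
def F_point_mass(x):
    """CDF of the point mass at 0: F(x) = 0 for x < 0 and 1 for x >= 0."""
    return 1.0 if x >= 0 else 0.0

for eps in [1.0, 0.1, 0.001]:
    mass = F_point_mass(0.0) - F_point_mass(-eps)  # mu_F((-eps, 0])
    length = eps                                   # lambda((-eps, 0])
    print(f"eps = {eps:>5}:  mu_F = {mass},  lambda = {length}")

# lambda({0}) = 0 but mu_F({0}) = 1, so this mu_F is NOT absolutely
# continuous with respect to Lebesgue measure: a "small" set carries
# "large" probability.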

Absolute continuity is related to differentiability. The derivative of one measure with respect to another (at some point $x$) is an intuitive concept: take a set of measurable neighborhoods of $x$ that shrink down to $x$ and compare the two measures in those neighborhoods. If they always approach the same limit, no matter what sequence of neighborhoods is chosen, then that limit is the derivative. (There's a technical issue: you need to constrain those neighborhoods so they don't have "pathological" shapes. That can be done by requiring each neighborhood to occupy a non-negligible portion of the region in which it lies.)

Differentiation in this sense is precisely what the question "What is the definition of probability on a continuous distribution?" addresses.

Let's write $D_\lambda(\mu_F)$ for the derivative of $\mu_F$ with respect to $\lambda$. The relevant theorem, a measure-theoretic version of the Fundamental Theorem of Calculus, asserts

$\mu_F$ is absolutely continuous with respect to $\lambda$ if and only if $$\mu_F(E) = \int_E \left(D_\lambda \mu_F\right)(x)\,\mathrm{d}\lambda$$ for every measurable set $E$. [Rudin, Theorem 8.6]

In other words, absolute continuity (of $\mu_F$ with respect to $\lambda$) is equivalent to the existence of a density function $D_\lambda(\mu_F)$.
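A numerical sketch of this equivalence, under the assumption of a standard normal $F$ (so the density is known in closed form) and using a simple trapezoid rule: integrating the density over an interval reproduces $\mu_F((a,b]) = F(b)-F(a)$, exactly as the theorem promises for an absolutely continuous distribution.

```python
from math import erf, exp, pi, sqrt

def density(x):
    """Standard normal density -- the role of D_lambda(mu_F) here."""
    return exp(-x * x / 2.0) / sqrt(2.0 * pi)

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def integrate(f, a, b, n=10_000):
    """Trapezoid rule for the integral of f over [a, b]."""
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return s * h

a, b = -1.0, 2.0
# mu_F((a, b]) computed two ways: from the CDF, and by integrating
# the density with respect to Lebesgue measure.
print(Phi(b) - Phi(a))
print(integrate(density, a, b))
```

The two numbers agree to within the quadrature error, which is the content of the theorem restricted to intervals; the measure-theoretic statement extends this to all measurable sets $E$.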

Summary

  1. A distribution $F$ is continuous when $F$ is continuous as a function: intuitively, it has no "jumps."

  2. A distribution $F$ is absolutely continuous when it has a density function (with respect to Lebesgue measure).

That the two kinds of continuity are not equivalent is demonstrated by examples, such as the one recounted at https://stats.stackexchange.com/a/229561/919: the famous Cantor function. Its distribution function $F$ is continuous, yet $F$ is almost everywhere horizontal (as its graph makes plain), whence $D_\lambda(\mu_F)$ is almost everywhere zero, and therefore $\int_{\mathbb{R}} D_\lambda(\mu_F)(x)\,\mathrm{d}\lambda = \int_{\mathbb{R}}0\,\mathrm{d}\lambda = 0$. This fails to give the correct value of $1$ demanded by the axiom of total probability, so $F$ is continuous but not absolutely continuous.
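The Cantor function can be computed by its standard self-similar recursion, truncated at finite depth (a sketch of my own, not taken from the cited answer). The snippet below evaluates it and checks the key features: the flat middle thirds have total length $1$, so the derivative vanishes almost everywhere, even though $F$ climbs from $0$ to $1$.

```python
def cantor(x, depth=40):
    """Cantor ("devil's staircase") function on [0, 1], via the standard
    self-similar recursion, truncated at the given depth."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        if x < 1/3:            # left third: F(x) = F(3x) / 2
            x *= 3.0
        elif x <= 2/3:         # flat middle third: F is constant here
            return value + scale
        else:                  # right third: F(x) = 1/2 + F(3x - 2) / 2
            value += scale
            x = 3.0 * x - 2.0
        scale *= 0.5
    return value

# Total Lebesgue measure of the removed (flat) middle thirds:
# sum over k of 2^k / 3^(k+1), which converges to 1.
flat_length = sum(2**k / 3**(k + 1) for k in range(60))
print(flat_length)          # essentially 1: F' = 0 almost everywhere
print(cantor(1.0) - cantor(0.0))   # yet F still rises by 1
```

Because the flat regions already exhaust Lebesgue measure $1$, integrating $F'$ can never recover the unit of probability that $F$ accumulates on the (measure-zero) Cantor set.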

Comments

Virtually all the distributions used in statistical applications are absolutely continuous, discrete, or mixtures thereof, so the distinction between continuity and absolute continuity is often ignored. Failing to appreciate the distinction, however, can lead to muddy reasoning and bad intuition, especially where rigor is most needed: when a situation is confusing or nonintuitive and we rely on mathematics to carry us to correct results. That is why we don't usually make a big deal of this in practice, but everyone should know about it.

Reference

Rudin, Walter. Real and Complex Analysis. McGraw-Hill, 1974: sections 6.2 (Absolute Continuity) and 8.1 (Derivatives of Measures).
