Mathematical Statistics – Defining Discrete and Continuous Variables

continuous data, definition, discrete data, mathematical-statistics, random variable

The definition of a continuous variable in our class seems to be, well, not a definition, as there are exceptions not included in its definition.

I am a 4th year math student and find it appalling that such a rinky dink thing can be a definition.
Could someone possibly give me a completely accurate definition that distinguishes continuous variables from discrete ones, or else a list of all the continuous variables that need to be treated as discrete, and when? So far I have that money is somehow discrete, and that time is too, sometimes.

I'm not sure about concentrations (e.g. parts per million), but weight, length and temperature seem to be continuous. Time seems to be completely confused.

They claim that a continuous random variable is one that would take an infinite amount of time to list all possible values, whereas a discrete one is one that can be counted.

Best Answer

A random variable $R$ is said to be continuous if for every real number $t,$ the probability that $R$ equals $t$ is zero: $P(R = t) = 0.$ A random variable $R$ is said to be discrete if there exists a countable set of values $t_1, \ldots, t_n, \ldots$ such that $P(R = t_i) > 0$ for all $i$ and $\sum\limits_i P(R = t_i) = 1.$

The Radon-Nikodym and Lebesgue decomposition theorems show that the cumulative distribution function (a.k.a. CDF) of every random variable can be expressed as $$ F = aF_{ac} + b F_{dc} + c F_{pm} $$ where $a, b, c \geq 0$ and $a + b + c = 1,$ where $F_{ac}$ is the CDF of an absolutely continuous random variable (i.e. $F_{ac}$ admits a density), $F_{dc}$ is the CDF of a degenerate continuous random variable (more commonly called singular continuous), and $F_{pm}$ is the CDF of a discrete random variable (so pm stands for point-mass).

It is hard to construct examples of degenerate continuous random variables, for their CDF must be continuous, non-decreasing, non-constant, and have a zero derivative almost everywhere. A typical example is the Cantor function, also known as the Devil's staircase (https://en.wikipedia.org/wiki/Cantor_function). So you usually assume that random variables are either absolutely continuous, discrete, or a mixture of these two types.
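To make the decomposition concrete, here is a minimal Python sketch of one such mixture. Everything in it is an illustrative assumption rather than part of the theorem: the weights $a = 0.5,$ $b = 0.25,$ $c = 0.25,$ the choice of Uniform(0, 1) for $F_{ac},$ the Cantor function for $F_{dc},$ a single point mass at $0.5$ for $F_{pm},$ and all function names. The Cantor function is only approximated, using a fixed number of ternary digits.

```python
def uniform_cdf(t):
    """F_ac: CDF of Uniform(0, 1), which has density 1 on [0, 1]."""
    return min(max(t, 0.0), 1.0)


def cantor_cdf(t, depth=40):
    """F_dc: approximate Cantor function, read off the ternary digits of t."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        t *= 3.0
        digit = int(t)                 # next ternary digit: 0, 1 or 2
        t -= digit
        if digit == 1:                 # t sits in a removed middle third (a plateau)
            return value + scale
        value += scale * (digit // 2)  # ternary digit 2 becomes binary digit 1
        scale *= 0.5
    return value


def point_mass_cdf(t, atom=0.5):
    """F_pm: CDF of a variable that equals `atom` with probability 1."""
    return 1.0 if t >= atom else 0.0


def mixture_cdf(t, a=0.5, b=0.25, c=0.25):
    """F = a*F_ac + b*F_dc + c*F_pm with a, b, c >= 0 and a + b + c = 1."""
    return a * uniform_cdf(t) + b * cantor_cdf(t) + c * point_mass_cdf(t)


# Evaluate F on a grid: it starts at 0, ends at 1, and never decreases.
grid = [i / 1000 for i in range(1001)]
values = [mixture_cdf(t) for t in grid]
print(values[0], mixture_cdf(0.5), values[-1])
```

In this sketch only the point-mass component makes $F$ jump (at $t = 0.5$); the uniform and Cantor components are both continuous, although only the uniform one has a density.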

EDIT: this question received a lot of attention, so let me expand a bit. This definition is motivated by the 1D case (univariate random variables). The condition that $P(R = t) = 0$ for all $t$ means that the CDF of $R$ is a continuous function $\mathbf{R} \to [0,1].$ Indeed, it is a well-known fact that a CDF is a non-decreasing function, so it can only have jump discontinuities. But the jump discontinuities of a CDF occur precisely at the "atoms" of the distribution (an "atom" of a random variable $R$ is a value $t$ such that $P(R = t) > 0$). To see this, we use that the CDF $F$ is already (by definition) continuous on the right, so that $F$ is continuous if and only if it is continuous on the left. Now, for $\delta > 0,$ $$ F(t) - F(t - \delta) = P(R \leq t) - P(R \leq t - \delta) = P(t - \delta < R \leq t). $$ By measure-theoretic properties of $P$ (continuity from above, since the intervals $(t - \delta, t]$ shrink to $\{t\}$), the right-hand side converges to $P(R = t)$ as $\delta \to 0^+,$ so that $F$ is continuous on the left at $t$ if and only if $P(R = t) = 0.$ This is the main motivation for calling an atomless random variable a "continuous random variable."
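The limit $F(t) - F(t - \delta) \to P(R = t)$ can be checked numerically. The following is only a sketch on a made-up example: the CDF `mixed_cdf` (half Uniform(0, 1), half a point mass at $0.5$) and the helper name `left_jump` are assumptions chosen for illustration.

```python
def mixed_cdf(t):
    """Hypothetical CDF: weight 1/2 on Uniform(0, 1), weight 1/2 on an atom at 0.5."""
    uniform_part = min(max(t, 0.0), 1.0)
    atom_part = 1.0 if t >= 0.5 else 0.0
    return 0.5 * uniform_part + 0.5 * atom_part


def left_jump(cdf, t, delta):
    """F(t) - F(t - delta), which equals P(t - delta < R <= t)."""
    return cdf(t) - cdf(t - delta)


# Shrink delta and watch the difference settle at P(R = t).
deltas = [10.0 ** -k for k in range(1, 8)]
jumps_at_atom = [left_jump(mixed_cdf, 0.5, d) for d in deltas]
jumps_elsewhere = [left_jump(mixed_cdf, 0.25, d) for d in deltas]

for d, at_atom, elsewhere in zip(deltas, jumps_at_atom, jumps_elsewhere):
    print(f"delta={d:.0e}  t=0.5: {at_atom:.7f}  t=0.25: {elsewhere:.7f}")
```

At the atom $t = 0.5$ the differences approach $0.5 = P(R = 0.5),$ while at $t = 0.25$ they approach $0,$ which matches the statement that $F$ is left-continuous exactly at the points carrying no mass.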