Solved – Empirical CDF vs CDF


I'm learning about the empirical cumulative distribution function, but I still don't understand:

  1. Why is it called 'Empirical'?

  2. Is there any difference between Empirical CDF and CDF?

Best Answer

Let $X$ be a random variable.

  • The cumulative distribution function $F(x)$ gives the probability $P(X \leq x)$.
  • The empirical cumulative distribution function $G(x)$ gives the fraction of observations in your sample that are $\leq x$, that is, $P(X \leq x)$ computed from the sample rather than from the true distribution.

The distinction is which probability measure is used. For the empirical CDF, you use the probability measure defined by the frequency counts in an empirical sample.
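The frequency-count measure can be written out explicitly: for a sample $x_1, \dots, x_n$, the empirical CDF is $G(x) = \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}\{x_i \leq x\}$. Below is a minimal Python sketch of that formula. The function name `ecdf` is only an illustrative choice, not something taken from a particular library.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function G where G(x) is the fraction of sample values <= x."""
    data = sorted(sample)
    n = len(data)
    if n == 0:
        raise ValueError("sample must be non-empty")

    def G(x):
        # bisect_right returns how many of the sorted values are <= x
        return bisect_right(data, x) / n

    return G


G = ecdf([3, 1, 2])
print(G(0), G(1), G(2.5), G(3))
```

Sorting once and using binary search keeps each evaluation of `G` cheap, which matters if you evaluate it at many points.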

Simple example (coin flip):

Let $X$ be a random variable denoting the result of a single coin flip where $X=1$ denotes heads and $X=0$ denotes tails.

The CDF for a fair coin is given by: $$ F(x) = \left\{ \begin{array}{ll} 0 & \text{for } x < 0\\ \frac{1}{2} & \text{for } 0 \leq x < 1 \\1 & \text{for } 1 \leq x \end{array} \right. $$

If you flipped 2 heads and 1 tail, the empirical CDF would be: $$ G(x) = \left\{ \begin{array}{ll} 0 & \text{for } x < 0\\ \frac{1}{3} & \text{for } 0 \leq x < 1 \\1 & \text{for } 1 \leq x \end{array} \right. $$

The empirical CDF reflects that in your sample, $1/3$ of your flips were tails (the value of $G$ on $0 \leq x < 1$) and $2/3$ were heads (the size of the jump at $x = 1$).
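As a quick check of the coin example, here is a small sketch that counts observations directly, using the encoding above (heads $= 1$, tails $= 0$). Since tails is coded as $0$, the value of $G$ between $0$ and $1$ is the fraction of tails, and the jump at $1$ is the fraction of heads. The names `flips` and `G` are just for illustration.

```python
from fractions import Fraction

flips = [1, 1, 0]  # 2 heads (coded 1) and 1 tail (coded 0)


def G(x):
    """Empirical CDF of the flips: fraction of observations <= x."""
    return Fraction(sum(1 for f in flips if f <= x), len(flips))


for x in (-0.5, 0, 0.5, 1, 2):
    print(x, G(x))

# Size of the jump at x = 1, i.e. the fraction of heads in the sample
print("jump at 1:", G(1) - G(0.5))
```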

Another example ($F$ is CDF for normal distribution):

Let $X$ be a normally distributed random variable with mean $0$ and standard deviation $1$.

The CDF is given by:

$$F(x) = \int_{-\infty}^x \frac{1}{\sqrt{2\pi}} e^{-t^2/2} \, dt$$

Let's say you had 3 IID draws and obtained the values $x_1 < x_2 < x_3$. The empirical CDF would be: $$ G(y) = \left\{ \begin{array}{ll} 0 & \text{for } y < x_1\\ \frac{1}{3} & \text{for } x_1 \leq y < x_2 \\\frac{2}{3} & \text{for } x_2 \leq y < x_3 \\1 & \text{for } x_3 \leq y \end{array} \right. $$

With enough IID draws, the empirical CDF converges to the underlying CDF of the population. By the Glivenko-Cantelli theorem, this convergence is uniform in $x$ with probability 1.
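The convergence can be seen numerically. The sketch below draws standard normal samples of increasing size and reports the largest vertical gap between the empirical CDF and the true normal CDF. It uses only the Python standard library; the helper names `normal_cdf` and `max_gap` and the fixed seed are assumptions made for this illustration.

```python
import math
import random


def normal_cdf(x):
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def max_gap(sample):
    """Largest vertical distance between the sample's empirical CDF and normal_cdf."""
    data = sorted(sample)
    n = len(data)
    gap = 0.0
    for i, x in enumerate(data):
        f = normal_cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x,
        # so compare the true CDF against both sides of the jump.
        gap = max(gap, abs((i + 1) / n - f), abs(f - i / n))
    return gap


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
for n in (10, 100, 1000, 10000):
    draws = [rng.gauss(0.0, 1.0) for _ in range(n)]
    print(n, round(max_gap(draws), 4))
```

The printed gap should generally shrink as `n` grows, though for any single random sample it need not decrease at every step.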