Normalized Concentration Inequality – Empirical CDF (iid Sum)

Tags: inequalities, measure-concentration, pr.probability, st.statistics, stochastic-processes

Consider the empirical and population CDF,
$$
F_n(t) = \frac{1}{n} \sum_{i=1}^n 1\{X_i \leq t\} \quad \mbox{and} \quad
F(t) = \mathbb{E} [F_n(t)],
$$

where $X_1, \dots, X_n$ are iid real-valued random variables and the expectation is taken with respect to their common distribution.

It is well-known (the DKW inequality with sharp constants due to Massart) that
$$
\mathbb{P}\big(\sqrt{n} \|F_n - F\|_\infty > \lambda\big) \leq 2 \exp(-2\lambda^2), \quad \mbox{for all}~\lambda > 0.
$$

Above, $\|\cdot\|_\infty$ denotes the supremum norm.
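
As a quick sanity check, here is a minimal Monte Carlo sketch (not part of the original question; it assumes NumPy and uses Uniform(0,1) samples, for which $F(t) = t$, with illustrative choices of $n$, the number of replications, and $\lambda$) comparing the empirical tail probability of $\sqrt{n}\,\|F_n - F\|_\infty$ with the Massart bound $2e^{-2\lambda^2}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps, lam = 200, 20_000, 1.0

exceed = 0
for _ in range(reps):
    x = np.sort(rng.uniform(size=n))  # X_i ~ Uniform(0,1), so F(t) = t
    i = np.arange(1, n + 1)
    # sup_t |F_n(t) - t| for sorted uniforms: the supremum is attained
    # just at / just before the jumps of the empirical CDF
    ks = np.max(np.maximum(i / n - x, x - (i - 1) / n))
    exceed += int(np.sqrt(n) * ks > lam)

print("empirical P(sqrt(n) * KS > lambda):", exceed / reps)
print("Massart bound 2 * exp(-2*lambda^2) :", 2 * np.exp(-2 * lam**2))
```

Up to Monte Carlo error, the first number should not exceed the second; near $\lambda = 1$ the two are quite close, since the Massart bound is nearly tight there.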

I am wondering if there is a normalized version of this inequality available. Specifically, define
$$
\widehat{\sigma}_n(t) = F_n(t) (1 - F_n(t)).
$$

This is a sample approximation to the true variance, since $\widehat{\sigma}_n(t) \to F(t)(1 - F(t))$ a.s. as $n \to \infty$ by the strong law and the continuous mapping theorem. The quantity $F(t)(1 - F(t))$ is the variance of the Bernoulli indicator $B(t) := 1\{X \leq t\}$.

Question: Consider the normalized quantity
$$
\sup_{t} \frac{|F_n(t) - F(t)|}{\sqrt{\widehat{\sigma}_n(t)}}.
$$

Does a similar inequality exist?

Best Answer

No, not as stated. If $t$ is chosen so that $F(t) = 1/n$, then $Y_n = nF_n(t)$ is Binomial$(n, \tfrac{1}{n})$, which converges to Poisson$(1)$, say in total variation distance. In particular, $P(Y_n = 0) = P(F_n(t) = 0) = (1 - \tfrac{1}{n})^n \to e^{-1}$, the Poisson$(1)$ pmf at $0$. On the event $\{F_n(t) = 0\}$ the denominator $\sqrt{\widehat{\sigma}_n(t)}$ vanishes while the numerator equals $F(t) = 1/n > 0$, so the normalized ratio is infinite. Hence the ratio is infinite with probability bounded away from zero even for a single $t$, unless a different normalization is used or the range of $t$ is restricted.
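
A small numerical illustration of this failure mode (again not from the original answer; it assumes NumPy, Uniform(0,1) samples, and the particular choice $t = 1/n$ so that $F(t) = 1/n$):

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 100, 50_000
t = 1.0 / n  # with X_i ~ Uniform(0,1), F(t) = t = 1/n

# F_n(t) = 0 exactly when no observation falls in [0, t]; on that event
# the numerator |F_n(t) - F(t)| equals 1/n > 0 while the denominator
# sqrt(F_n(t)(1 - F_n(t))) equals 0, so the normalized ratio is +infinity.
samples = rng.uniform(size=(reps, n))
p_zero = np.mean(samples.min(axis=1) > t)

print("empirical P(F_n(t) = 0):", p_zero)
print("(1 - 1/n)^n            :", (1 - 1 / n) ** n)
print("exp(-1)                :", np.exp(-1))
```

All three numbers should agree to about two decimal places, confirming that the normalized quantity is $+\infty$ with probability roughly $e^{-1}$ already for this single $t$.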
