[Math] Chernoff-Hoeffding-like bound for KL divergence of sampled distribution and true distribution

probabilityreference-request

Let $X_1, X_2, \ldots$ be i.i.d. Bernoulli random variables with mean $\mu$, and $\hat{\mu}_n = \frac{1}{n}\sum_i X_i$. Then the Chernoff-Hoeffding bound says that for all $c > 0$,
$$
P\left(\hat{\mu}_n – \mu > c\right) \leq e^{-2 n c^2}.
$$
Now, is there any similar bound for $KL(\hat{\mu}_n||\mu)$ instead of $(\hat{\mu}_n – \mu)$? [NOTE: $KL(\hat{\mu}_n||\mu)$ means the KL-divergence between Bernoulli($\hat{\mu}_n$) and Bernoulli($\mu$).]

One thing I know is that for Bernoulli, $KL(p||q) \geq 2(p-q)^2$, and hence,
$$
P\left(\hat{\mu}_n – \mu > c\right) \leq P\left(KL(\hat{\mu}_n || \mu) > 2c^2\right),
$$
but it doesn't help.

Thanks.

Best Answer

I don't follow the question completely; in particular, the expression $\newcommand{\KL}{\mathrm{KL}}$ $\KL(\hat \mu \| \mu)$ does not make sense to me because $\mu$ is a real number rather than a distribution. (Edit: The question is now edited.)

Nevertheless I will mention that Sanov's theorem seems to be related to what the OP has in mind. Suppose that the true distribution is $q$ and the empirical distribution is $\hat p$. Sanov's theorem allows us to measure the KL divergence between $\hat p$ and $q$; in fact, it does a bit more, as we will see. (I am quoting the theorem from the wikipedia page.)

Sanov's theorem. Let $A$ be a set of probability distributions over an alphabet $X$, and let $q$ be an arbitrary distribution over $X$ (where $q$ may or may not be in $A$). Suppose $x_1, x_2, \ldots, x_n$ are $n$ i.i.d. samples from $q$. Then the probability that the empirical distribution of the sample lies inside $A$ is at most $(n+1)^{|X|} 2^{-n \delta}$, where $$ \delta := \inf_{p \in A} \ \KL(p \| q) .$$

The OP's question then follows as a

Corollary. If $\hat p$ is the empirical distribution, then $$ \Pr[\KL(\hat p \| q) \geqslant \delta] \leqslant (n+1)^{|X|} 2^{- n \delta}. $$

Proof. Apply Sanov's theorem with $A$ defined as the set of distributions $p$ such that $\KL(p \| q) \geqslant \delta$.

Related Question