[Math] Proof that Rényi divergence = KL divergence when $\alpha \rightarrow 1$


The Kullback–Leibler (KL) divergence between two parametrized distributions with densities $p(\theta)$ and $q(\theta)$ is defined as:

$$
D_{\text{KL}}(p(\theta) || q(\theta)) = \int p(\theta) \log \frac{p(\theta)}{q(\theta)} \, d\theta
$$

The Rényi divergence of order $\alpha$ (for $\alpha > 0$, $\alpha \neq 1$) is defined as:

$$
D_{\alpha}(p(\theta) || q(\theta)) = \frac{1}{\alpha-1} \log\int p(\theta)^\alpha q(\theta)^{1-\alpha} \, d\theta
$$
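
As a concrete instance of both definitions (a quick sketch for equal-variance Gaussians, with the integral taken over the whole real line), take $p = \mathcal{N}(\mu_1, \sigma^2)$ and $q = \mathcal{N}(\mu_2, \sigma^2)$. Completing the square in the exponent gives

$$
\int p(\theta)^\alpha q(\theta)^{1-\alpha} \, d\theta = \exp\!\left(-\frac{\alpha(1-\alpha)(\mu_1-\mu_2)^2}{2\sigma^2}\right),
\qquad\text{so}\qquad
D_{\alpha}(p || q) = \frac{\alpha(\mu_1-\mu_2)^2}{2\sigma^2},
$$

which tends to the Gaussian KL divergence $D_{\text{KL}}(p || q) = (\mu_1-\mu_2)^2/(2\sigma^2)$ as $\alpha \rightarrow 1$.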

It is known that the KL divergence is recovered from the Rényi divergence in the limit $\alpha \rightarrow 1$.

But what is the proof for that?

Best Answer

Take the limit with L'Hôpital's rule. As $\alpha \rightarrow 1$, the numerator $\log\int p(\theta)^\alpha q(\theta)^{1-\alpha}\,d\theta \rightarrow \log 1 = 0$ and the denominator $\alpha - 1 \rightarrow 0$, so the quotient is an indeterminate $0/0$ form. Differentiating numerator and denominator with respect to $\alpha$ (and differentiating under the integral sign) gives

\begin{align*}
\lim_{\alpha\rightarrow 1} D_\alpha(p||q)
&= \lim_{\alpha\rightarrow 1} \frac{1}{\alpha-1}\log\int p(\theta)^\alpha q(\theta)^{1-\alpha}\,d\theta \\
&= \lim_{\alpha\rightarrow 1} \frac{\partial}{\partial \alpha} \log\int p(\theta)^\alpha q(\theta)^{1-\alpha}\,d\theta \\[2mm]
&= \lim_{\alpha\rightarrow 1} \frac{\displaystyle\int p(\theta)^\alpha q(\theta)^{1-\alpha} \left[\log p(\theta) - \log q(\theta)\right] d\theta}{\displaystyle \int p(\theta)^\alpha q(\theta)^{1-\alpha}\,d\theta} \\
&= \frac{\displaystyle \int p(\theta)\log\big(p(\theta)/q(\theta)\big)\,d\theta}{\displaystyle \int p(\theta)\,d\theta} \\[2mm]
&= \int p(\theta)\log\left(\frac{p(\theta)}{q(\theta)}\right) d\theta \\[2mm]
&= D_{\text{KL}}(p||q),
\end{align*}

where the fourth line exchanges the limit with the integrals (e.g. by dominated convergence, when the integrals involved are finite), using $p(\theta)^\alpha q(\theta)^{1-\alpha} \rightarrow p(\theta)$ pointwise as $\alpha \rightarrow 1$, and the fraction then simplifies because $\int p(\theta)\,d\theta = 1$.
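
For a quick numerical check of this limit, the sketch below (a minimal illustration, assuming NumPy and SciPy are available; the function names and the Gaussian test case are my own choices) evaluates $D_\alpha(p||q)$ by quadrature for two univariate Gaussians and compares it with the closed-form KL divergence as $\alpha \rightarrow 1$:

```python
# Minimal sketch: check numerically that D_alpha(p || q) -> KL(p || q) as alpha -> 1,
# taking p and q to be univariate Gaussians so KL(p || q) has a known closed form.
import numpy as np
from scipy.integrate import quad

def log_gaussian_pdf(x, mu, sigma):
    """Log-density of N(mu, sigma^2) at x (log-space avoids underflow in the tails)."""
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))

def renyi_divergence(alpha, mu_p, sigma_p, mu_q, sigma_q):
    """D_alpha(p || q) = log(integral of p^alpha q^(1-alpha)) / (alpha - 1), by quadrature."""
    def integrand(x):
        return np.exp(alpha * log_gaussian_pdf(x, mu_p, sigma_p)
                      + (1.0 - alpha) * log_gaussian_pdf(x, mu_q, sigma_q))
    integral, _ = quad(integrand, -np.inf, np.inf)
    return np.log(integral) / (alpha - 1.0)

def kl_gaussians(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL(p || q) for two univariate Gaussians."""
    return (np.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
            - 0.5)

mu_p, sigma_p, mu_q, sigma_q = 0.0, 1.0, 1.5, 2.0
print("KL(p||q) =", kl_gaussians(mu_p, sigma_p, mu_q, sigma_q))
for alpha in (0.9, 0.99, 0.999, 1.001, 1.01, 1.1):
    print(f"alpha = {alpha:5.3f}   D_alpha(p||q) = "
          f"{renyi_divergence(alpha, mu_p, sigma_p, mu_q, sigma_q):.6f}")
```

The printed Rényi values approach the KL value as $\alpha \rightarrow 1$ from either side, in line with the derivation above.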