Solved – KL divergence calculation

distance-functions, distributions, information-retrieval, machine-learning

I am wondering how one can calculate the KL divergence between two probability distributions. For example, if we have

t1 = 0.4, 0.2, 0.3, 0.05, 0.05
t2 = 0.2, 0.3, 0, 0.1, 0.4

The formula is a bit complicated for me 🙁

Best Answer

Using brute force and the first formula here (the Jensen–Shannon divergence), which is based on the Kullback–Leibler divergence, you are starting from two multisets, each with 5 values, 3 of which are shared between them. So the combination of them is the multiset $$M=\{0, 0.05, 0.05, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4\}$$

so, using $D_{\mathrm{KL}}(P\|Q) = \sum_i P(i) \log \frac{P(i)}{Q(i)}$ and treating each multiset as an empirical distribution that puts mass $\frac{1}{5}$ on each of its five values (so each of the ten values listed in $M$ carries mass $\frac{1}{10}$, with repeated values adding up),

$$JSD(t_1 \parallel t_2)= \frac{1}{2}D_{\mathrm{KL}}(t_1 \parallel M)+\frac{1}{2}D_{\mathrm{KL}}(t_2 \parallel M)$$ $$=\frac{1}{2}\left(1\cdot\frac{2}{5} \log\left(\frac{2/5}{2/10}\right) +3\cdot\frac{1}{5} \log\left(\frac{1/5}{2/10}\right)\right) $$ $$+\frac{1}{2}\left(2\cdot\frac{1}{5} \log\left(\frac{1/5}{1/10}\right) +3\cdot\frac{1}{5} \log\left(\frac{1/5}{2/10}\right)\right) $$ $$= \dfrac{2}{5}\log(2) \approx 0.277$$

though you may want to check this. Other calculations, such as those using Shannon entropy, should produce the same result.
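Here is a minimal Python sketch of that check, assuming the same reading as above: each list is treated as an empirical distribution placing mass $\frac{1}{5}$ on each of its five entries, natural logarithms are used (to match $\approx 0.277$), and the helper names `kl_divergence` and `empirical` are just illustrative, not from any library:

```python
from collections import Counter
from math import log

def kl_divergence(p, q):
    # D_KL(P || Q) = sum_i P(i) log(P(i) / Q(i)); terms with P(i) = 0 contribute 0
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def empirical(values, support):
    # Empirical distribution over `support`: mass 1/len(values) per entry of `values`
    counts = Counter(values)
    return [counts[x] / len(values) for x in support]

t1 = [0.4, 0.2, 0.3, 0.05, 0.05]
t2 = [0.2, 0.3, 0, 0.1, 0.4]

support = sorted(set(t1) | set(t2))          # distinct values: 0, 0.05, 0.1, 0.2, 0.3, 0.4
p = empirical(t1, support)                   # [0, 2/5, 0, 1/5, 1/5, 1/5]
q = empirical(t2, support)                   # [1/5, 0, 1/5, 1/5, 1/5, 1/5]
m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # the mixture M

jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
print(jsd, 0.4 * log(2))                     # both print 0.2772588722239781
```

If SciPy is available, `scipy.stats.entropy(p, m)` should give the same KL terms, and `scipy.spatial.distance.jensenshannon(p, q)` returns the square root of this quantity rather than the divergence itself.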
