Is Kullback-Leibler Divergence not equal to Relative Entropy?

convex optimization, information theory, machine learning

In many books, the Kullback-Leibler divergence is defined as the relative entropy,
$$
D_{kl}(u,v) = \sum_{i=1}^n u_i \log(u_i/v_i).
$$

However, in the book Convex Optimization by Stephen Boyd (page 90), the KL divergence is defined as
$$
D_{kl}(u,v) = \sum_{i=1}^n \bigl(u_i \log(u_i/v_i) - u_i + v_i\bigr).
$$

Why does the KL divergence have these two different definitions? Which one is correct?

Best Answer

They're equivalent when $u$ and $v$ are probability distributions: since $\sum_i u_i = \sum_i v_i = 1$, the extra terms $-u_i + v_i$ sum to zero. Boyd's version is also well defined (and nonnegative) for arbitrary positive vectors that need not sum to one.
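A quick numerical check illustrates this (a minimal sketch using NumPy; the example vectors are arbitrary): the two formulas agree for probability vectors and differ for general positive vectors.

```python
import numpy as np

# Two probability distributions on the same support (both sum to 1).
u = np.array([0.2, 0.5, 0.3])
v = np.array([0.4, 0.4, 0.2])

# Standard relative-entropy form: sum_i u_i * log(u_i / v_i)
kl_standard = np.sum(u * np.log(u / v))

# Boyd's form: sum_i (u_i * log(u_i / v_i) - u_i + v_i)
kl_boyd = np.sum(u * np.log(u / v) - u + v)

print(kl_standard, kl_boyd)  # identical, since sum(u) == sum(v) == 1

# For general positive vectors that do not sum to 1, the two forms differ.
a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 1.0, 1.0])
print(np.sum(a * np.log(a / b)))          # standard form
print(np.sum(a * np.log(a / b) - a + b))  # Boyd's form (always >= 0)
```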
