Information Theory – Calculating Jensen-Shannon Divergence for Three Distributions

distance-functions, information-theory

I would like to calculate the Jensen-Shannon divergence for the following 3 distributions. Is the calculation below correct? (I followed the JSD formula from Wikipedia):

P1  a:1/2  b:1/2    c:0
P2  a:0    b:1/10   c:9/10
P3  a:1/3  b:1/3    c:1/3
All distributions have equal weights, i.e. 1/3.
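
For reference, the generalized Jensen-Shannon divergence with weights $\pi_1,\dots,\pi_n$ (the Wikipedia formula referred to above) is

$$\mathrm{JSD}_{\pi_1,\dots,\pi_n}(P_1,\dots,P_n) = H\!\left(\sum_{i=1}^{n} \pi_i P_i\right) - \sum_{i=1}^{n} \pi_i\, H(P_i),$$

where $H$ denotes the Shannon entropy.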

JSD(P1, P2, P3) = H[(1/6, 1/6, 0) + (0, 1/30, 9/30) + (1/9,1/9,1/9)] - 
                 [1/3*H[(1/2,1/2,0)] + 1/3*H[(0,1/10,9/10)] + 1/3*H[(1/3,1/3,1/3)]]

JSD(P1, P2, P3) = H[(1/6, 1/5, 9/30)] - [0 + 1/3*0.693 + 0] = 1.098-0.693 = 0.867

Thanks in advance…

EDIT: Here's some quick-and-dirty Python code that calculates this as well:

    import math

    def entropy(prob_dist, base=math.e):
        # Shannon entropy; zero probabilities are skipped since 0*log(0) is taken as 0
        return -sum([p * math.log(p, base) for p in prob_dist if p != 0])

    def jsd(prob_dists, base=math.e):
        weight = 1 / len(prob_dists)  # all distributions get the same weight
        js_left = [0] * len(prob_dists[0])  # running weighted mixture distribution
        js_right = 0  # weighted sum of the individual entropies
        for pd in prob_dists:
            for i, p in enumerate(pd):
                js_left[i] += p * weight
            js_right += weight * entropy(pd, base)
        return entropy(js_left, base) - js_right

usage: jsd([[1/2,1/2,0],[0,1/10,9/10],[1/3,1/3,1/3]])
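
For a cross-check, here is a minimal sketch of the same computation written against `numpy` and `scipy.stats.entropy` (assuming those libraries are available; the name `jsd_general` and the optional `weights` argument are illustrative additions, not part of the original code):

    import numpy as np
    from scipy.stats import entropy as shannon_entropy  # natural log by default

    def jsd_general(prob_dists, weights=None, base=None):
        # Generalised JSD: H(sum_i w_i * P_i) - sum_i w_i * H(P_i)
        dists = np.asarray(prob_dists, dtype=float)
        if weights is None:
            weights = np.full(len(dists), 1.0 / len(dists))  # uniform weights
        mixture = np.average(dists, axis=0, weights=weights)  # weighted mixture
        return shannon_entropy(mixture, base=base) - sum(
            w * shannon_entropy(d, base=base) for w, d in zip(weights, dists))

    # jsd_general([[1/2, 1/2, 0], [0, 1/10, 9/10], [1/3, 1/3, 1/3]])  ->  ~0.378889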

Best Answer

There is a mistake in the mixture distribution: it should be $(5/18, 28/90, 37/90)$ rather than $(1/6, 1/5, 9/30)$, which does not sum to 1. The entropy (with natural log) of the corrected mixture is 1.084503. Your entropy terms for the individual distributions are also wrong.

I will give the details of one computation:

$$H(1/2,1/2,0) = -\tfrac{1}{2}\log\tfrac{1}{2} - \tfrac{1}{2}\log\tfrac{1}{2} + 0 = 0.6931472$$

Similarly, the other two terms are 0.325083 and 1.098612, so the final result is
$$1.084503 - (0.6931472 + 0.325083 + 1.098612)/3 = 0.378889.$$
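
As a quick numerical check of these figures (natural log throughout; the helper `H` below is just for illustration):

    import math

    def H(p):
        # Shannon entropy with natural log, treating 0*log(0) as 0
        return -sum(x * math.log(x) for x in p if x > 0)

    mixture = [5/18, 28/90, 37/90]    # corrected mixture distribution
    H(mixture)                        # ~ 1.084503
    H([1/2, 1/2, 0])                  # ~ 0.6931472
    H([0, 1/10, 9/10])                # ~ 0.325083
    H([1/3, 1/3, 1/3])                # ~ 1.098612
    H(mixture) - (H([1/2, 1/2, 0]) + H([0, 1/10, 9/10]) + H([1/3, 1/3, 1/3])) / 3  # ~ 0.378889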