Information Entropy – Why It Can Be Greater Than 1

entropy, mathematical-statistics, python

I implemented the following function to calculate entropy:

from math import log

def calc_entropy(probs):
    """Shannon entropy of a discrete distribution, in bits (base-2 log)."""
    my_sum = 0
    for p in probs:
        if p > 0:  # skip zero probabilities: the term 0 * log(0) is taken to be 0
            my_sum += p * log(p, 2)

    return -my_sum

Result:

>>> calc_entropy([1/7.0, 1/7.0, 5/7.0])
1.1488348542809168
>>> from scipy.stats import entropy  # using a built-in package
                                     # gives the same answer
>>> entropy([1/7.0, 1/7.0, 5/7.0], base=2)
1.1488348542809166

My understanding was that entropy is between 0 and 1, with 0 meaning very certain and 1 meaning very uncertain. Why do I get a measure of entropy greater than 1?

I know that if I increase the size of the log base, the entropy measure will be smaller, but I thought base 2 was standard, so I don't think that's the problem.
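
For example, the same distribution measured with different log bases is just rescaled by a constant (a quick check with scipy; the printed values are approximate):

from scipy.stats import entropy

probs = [1/7.0, 1/7.0, 5/7.0]

print(entropy(probs, base=2))    # bits, ~1.149
print(entropy(probs))            # nats (natural log), ~0.796
print(entropy(probs, base=10))   # ~0.346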

I must be missing something obvious, but what?

Best Answer

Entropy is not the same as probability.

Entropy measures the "information" or "uncertainty" of a random variable. When you use base 2, it is measured in bits, and there can be more than one bit of information in a variable.

In this example, one sample "contains" about 1.15 bits of information. In other words, if you were able to compress a series of samples perfectly, you would need that many bits per sample, on average.
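
The 0-to-1 intuition only holds for a variable with two outcomes: with n equally likely outcomes the entropy is log2(n) bits, so it can exceed 1 as soon as there are more than two outcomes. A small sketch reusing the calc_entropy function from the question:

# Two equally likely outcomes: at most 1 bit.
print(calc_entropy([0.5, 0.5]))                  # 1.0

# Four equally likely outcomes: log2(4) = 2 bits.
print(calc_entropy([0.25, 0.25, 0.25, 0.25]))    # 2.0

# Three outcomes, as in the question: bounded by log2(3), about 1.585 bits.
print(calc_entropy([1/7.0, 1/7.0, 5/7.0]))       # ~1.149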
