Shannon entropy, metric entropy and relative entropy

entropy, information theory, probability

Please explain the following for me:

Example generated using an online calculator:

sequence = aaabbcddddefgghhijjk

sequence length = 20

unique characters in sequence = 11

frequencies of unique characters:

a = 0.15, b = 0.1, c = 0.05, d = 0.2, e = 0.05, f = 0.05, g = 0.1, h = 0.1, i = 0.05, j = 0.1, k = 0.05

for which we get entropy as:
$$
H(X) = -[(0.15\log_2 0.15)+(0.1\log_2 0.1)+(0.05\log_2 0.05)+(0.2\log_2 0.2)+
(0.05\log_2 0.05)+(0.05\log_2 0.05)+(0.1\log_2 0.1)+(0.1\log_2 0.1)+(0.05\log_2 0.05)+(0.1\log_2 0.1)+(0.05\log_2 0.05)]
$$

$$
H(X) \approx -[(-0.411)+(-0.332)+(-0.216)+(-0.464)+(-0.216)+(-0.216)+(-0.332)+(-0.332)+(-0.216)+(-0.332)+(-0.216)]
$$
$$
H(X) = -[-3.28418]
$$
$$
H(X) = 3.28418
$$
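
The hand calculation above can be reproduced in a few lines of Python. This is only a minimal sketch using the standard library; the function name `shannon_entropy` is my own choice for illustration and has nothing to do with how the online calculator is implemented.

```python
from collections import Counter
from math import log2


def shannon_entropy(sequence):
    """Shannon entropy (in bits per symbol) of the symbols in `sequence`."""
    counts = Counter(sequence)
    n = len(sequence)
    # Sum -p * log2(p) over the relative frequency p of each distinct symbol.
    return -sum((c / n) * log2(c / n) for c in counts.values())


sequence = "aaabbcddddefgghhijjk"
print(round(shannon_entropy(sequence), 5))  # 3.28418
```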
If metric entropy is defined as the ratio of H(X) to the sequence length, then:
$$
\text{Metric entropy} = \frac{3.28418}{20} = 0.16421
$$
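
As a rough check of this definition, the sketch below divides the entropy by the sequence length. The name `metric_entropy` is a hypothetical helper, and the code assumes the definition stated above (entropy divided by length). It also shows one consequence of that definition: repeating the sequence leaves the character frequencies unchanged but halves the result.

```python
from collections import Counter
from math import log2


def metric_entropy(sequence):
    """Shannon entropy of `sequence` divided by its length."""
    n = len(sequence)
    entropy = -sum((c / n) * log2(c / n) for c in Counter(sequence).values())
    return entropy / n


sequence = "aaabbcddddefgghhijjk"
print(round(metric_entropy(sequence), 5))      # 0.16421
# Repeating the sequence keeps the symbol frequencies (and so H) unchanged,
# but doubles the length, so the metric entropy halves.
print(round(metric_entropy(sequence * 2), 5))  # 0.0821
```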

What is the ratio of the entropy to the number of unique characters? For this example:
$$
\frac{3.28418}{11} = 0.2986
$$

Could this be considered relative entropy?

Best Answer

In case those who voted on this question would like an answer, here's what I've learned since posting it:

The "relative entropy" I was asking about is more commonly called normalised entropy, since the term "relative entropy" is also used for the Kullback–Leibler divergence.
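
To make the distinction concrete, here is a small sketch of the Kullback–Leibler divergence for the example sequence. It compares two distributions rather than rescaling a single entropy. The function name `kl_divergence` and the choice of a uniform reference distribution over the 11 observed characters are assumptions made purely for illustration.

```python
from collections import Counter
from math import log2


def kl_divergence(p, q):
    """D_KL(p || q) in bits; p and q map each symbol to its probability."""
    # Terms with p[s] == 0 contribute nothing; q[s] must be > 0 wherever p[s] > 0.
    return sum(ps * log2(ps / q[s]) for s, ps in p.items() if ps > 0)


sequence = "aaabbcddddefgghhijjk"
n = len(sequence)
p = {s: c / n for s, c in Counter(sequence).items()}  # observed frequencies
q = {s: 1 / len(p) for s in p}                         # uniform over 11 symbols

print(round(kl_divergence(p, q), 5))  # 0.17525
```

Against a uniform reference, this value equals log2(11) minus the observed entropy, so it is a difference measured in bits, not a ratio between 0 and 1.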

Normalised entropy is the ratio of the observed entropy to the theoretical maximum entropy of the system. The maximum is reached when every character is equally likely, so for the 11 unique characters in the example: $$ H_{\max} = \log_2(11) = 3.45943 $$

Now we get normalised entropy as: $$ \frac{3.28418}{3.45943} = 0.94934 $$

This value measures how random the sequence is relative to the set of unique characters that actually occur in it. If we were instead interested in the randomness of the sequence relative to the full lowercase English alphabet (26 letters), we would get:

$$ \frac{3.28418}{\log_2(26)} = \frac{3.28418}{4.70044} \approx 0.6987 $$
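
Both normalisations can be checked with a short Python sketch. The function name `normalised_entropy` and its `alphabet_size` parameter are hypothetical names of my own; the code simply assumes the definition used above, observed entropy divided by log2 of the number of available symbols.

```python
from collections import Counter
from math import log2


def normalised_entropy(sequence, alphabet_size=None):
    """Observed entropy divided by the maximum entropy log2(alphabet_size).

    If alphabet_size is omitted, the number of distinct symbols that
    actually occur in the sequence is used.
    """
    counts = Counter(sequence)
    n = len(sequence)
    if alphabet_size is None:
        alphabet_size = len(counts)
    if alphabet_size < 2:
        raise ValueError("need at least two possible symbols")
    entropy = -sum((c / n) * log2(c / n) for c in counts.values())
    return entropy / log2(alphabet_size)


sequence = "aaabbcddddefgghhijjk"
print(round(normalised_entropy(sequence), 5))      # 0.94934
print(round(normalised_entropy(sequence, 26), 4))  # 0.6987
```

The same sequence scores lower against the 26-letter alphabet because 15 of the available letters never appear.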
