Information Theory – Is Log the Only Choice for Measuring Information?

functions, information theory, logarithms

When we quantify information, we use $I(x)=-\log{P(x)}$, where $P(x)$ is the probability of some event $x$. The explanation I always got, and was satisfied with up until now, is that for two independent events we multiply their probabilities to get the probability of both occurring, and we intuitively want their information contents to add. So, treating $I$ as a function of the probability, we want $I(x \cdot y) = I(x) + I(y)$. The functions $k \log(x)$ for a constant $k$ satisfy this identity, and we take $k$ negative (e.g. $k=-1$) so that information is nonnegative for probabilities in $(0,1]$.
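For a concrete instance of the additivity (taking base-$2$ logs, which is just one admissible choice of $k$): two independent fair coin flips have joint probability $\tfrac14$, and $$I\left(\tfrac12\cdot\tfrac12\right)=-\log_2\tfrac14=2=1+1=I\left(\tfrac12\right)+I\left(\tfrac12\right).$$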

But I'm wondering if logarithms are more than just a sensible choice. Are they the only choice? I can't immediately think of another class of functions that satisfies that basic identity. Even in Shannon's original paper on information theory, he doesn't say it's the only choice; he justifies it by saying that logs fit what we expect and are easy to work with. Is there more to it?

Best Answer

We want to classify all continuous(!) functions $I\colon(0,1]\to\Bbb R$ with $I(xy)=I(x)+I(y)$. If $I$ is such a function, we can define the (also continuous) function $f\colon[0,\infty)\to \Bbb R$ given by $f(x)=I(e^{-x})$ (using that $x\ge 0$ implies $e^{-x}\in(0,1]$). Then for $f$ we have the functional equation $$f(x+y)=I(e^{-(x+y)})=I(e^{-x}e^{-y})=I(e^{-x})+I(e^{-y})=f(x)+f(y).$$ Let $$ S:=\{\,a\in[0,\infty)\mid \forall x\in[0,\infty)\colon f(ax)=af(x)\,\}.$$ Then trivially $1\in S$. Also, $f(0+0)=f(0)+f(0)$ implies $f(0)=0$ and so $0\in S$. By the functional equation, $S$ is closed under addition: If $a,a'\in S$, then for all $x\ge 0$ we have $$f((a+a')x)=f(ax+a'x)=f(ax)+f(a'x)=af(x)+a'f(x)=(a+a')f(x)$$ and so also $a+a'\in S$.
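(As a quick numeric sanity check, not part of the argument: a minimal Python sketch, taking $I(p)=-\ln p$ as one admissible choice, confirms that the substituted $f$ is indeed additive.)

```python
import math

# A minimal numeric sketch (not part of the proof): take one admissible
# choice I(p) = -ln(p) and check that f(x) = I(e^{-x}) is additive.
def I(p):
    return -math.log(p)

def f(x):
    return I(math.exp(-x))

for x, y in [(0.3, 1.7), (2.0, 5.0), (0.0, 4.2)]:
    # f(x + y) should equal f(x) + f(y) up to floating-point error.
    assert math.isclose(f(x + y), f(x) + f(y), abs_tol=1e-12)
```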

Using this, we show by induction that $\Bbb N\subseteq S$: We have $1\in S$; and if $n\in S$, then also $n+1\in S$ (because $1\in S$ and $S$ is closed under addition).

Next note that if $a,b\in S$ with $b>0$, then for all $x\ge0$ we have $f(a\frac xb)=af(\frac xb)$ and $f(x)=f(b\frac xb)=bf(\frac xb)$, i.e., $f(\frac ab x)=af(\frac xb)=\frac ab\,bf(\frac xb)=\frac abf(x)$, and so $\frac ab\in S$. As $\Bbb N\subseteq S$, this implies that $S$ contains all positive rationals, $\Bbb Q_{>0}\subseteq S$.

In particular, if we let $c:=f(1)$, then $f(x)=cx$ for all $x\in \Bbb Q_{>0}$. As we wanted continuous functions, it follows that $f(x)=cx$ for all $x\in[0,\infty)$. Then $$ I(x)=f(-\ln x)=-c\ln x.$$
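(A small numeric illustration of the conclusion, assuming a hypothetical continuous additive $I$ with $c=2$, i.e. $I(p)=-2\ln p$: the constant can be recovered as $c=f(1)=I(e^{-1})$, and $I$ then agrees with $-c\ln x$ on sample points.)

```python
import math

# Sketch: a hypothetical continuous additive I, here I(p) = -2*ln(p), so c = 2.
def I(p):
    return -2.0 * math.log(p)

# Recover the constant as in the answer: c = f(1) = I(e^{-1}).
c = I(math.exp(-1.0))

# I should then coincide with -c*ln(x) on (0, 1].
for p in [0.01, 0.25, 0.5, 0.9, 1.0]:
    assert math.isclose(I(p), -c * math.log(p), abs_tol=1e-12)
print(f"recovered c = {c}")  # ~ 2.0
```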

Remark: The request for continuity of $I$ (and hence $f$) is of course reasonable in the given context. But it turns out that much milder restrictions on $f$ (for instance measurability, monotonicity, or boundedness on some interval) suffice to force the result found above. It is only without any such restriction that the Axiom of Choice supplies highly non-continuous additional solutions to the functional equation. The remark that logs fit what we expect and are easy to work with is quite an understatement once one considers these non-continuous solutions.
