[Math] Is the channel capacity (defined as the maximum of mutual information) always less than 1

information theory

The channel capacity is defined as the maximum achievable rate, and it equals the maximum, over input distributions, of the mutual information between input and output, i.e.

$C=\max_{P_X} I(X;Y)$

Also the information rate is defined as

$R=\frac{\log_2 M}{n}$

where $M$ is the number of messages and $n$ is the blocklength.

Here is my question: if we have $M$ messages and blocklength $n$, we need $\log_2(M)$ bits to represent the $M$ messages. To avoid channel errors, however, we may add redundancy and transmit each message in $n$ bits with $n>\log_2(M)$, which gives $R\leq 1$ and consequently $C\leq 1$.
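As a concrete check of that arithmetic, here is a minimal sketch in Python (the values $M=4$ and $n=6$ are just an illustration, not from the question):

```python
import math

M, n = 4, 6             # 4 messages sent over 6 binary channel uses (illustrative values)
R = math.log2(M) / n    # rate in bits per channel use
print(R)                # 0.333... <= 1: each binary symbol carries at most one bit
```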

Are there other ways of coding (not necessarily adding redundancy) such that the capacity is larger than $1$?

With this definition of rate, I always think of it as being at most $1$, but from the mutual information point of view I don't see any reason why it should be. Can someone please clarify?

Best Answer

For a channel with input alphabet $\cal X$ and output alphabet $\cal Y$, the channel capacity is upper bounded by $\log \min (|\cal X|, |\cal Y|)$.

For example, take a channel with the same input and output alphabet that just reproduces its input exactly. Its capacity is $\log |\cal X|$, which is achieved by a uniform distribution on the inputs (in fact, for any weakly symmetric channel the uniform distribution achieves capacity, but that's another matter). This can be larger than $1$ bit.
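To see this numerically, here is a minimal sketch in Python; the `mutual_information` helper and the choice of a 4-letter alphabet are my own illustration, not part of the answer:

```python
import numpy as np

def mutual_information(p_x, p_y_given_x):
    """I(X;Y) in bits, given input distribution p_x and channel matrix p_y_given_x,
    where p_y_given_x[i, j] = P(Y = j | X = i)."""
    p_xy = p_x[:, None] * p_y_given_x            # joint distribution P(X, Y)
    p_y = p_xy.sum(axis=0)                       # output marginal P(Y)
    prod = p_x[:, None] * p_y[None, :]           # product of marginals
    mask = p_xy > 0                              # skip zero-probability terms
    return np.sum(p_xy[mask] * np.log2(p_xy[mask] / prod[mask]))

# Noiseless channel on a 4-letter alphabet: Y = X with probability 1.
channel = np.eye(4)
uniform = np.full(4, 0.25)
print(mutual_information(uniform, channel))      # 2.0 bits = log2(4) > 1
```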

A rate $R$ is achievable if there exists a sequence of $(\lceil 2^{nR} \rceil, n)$ codes whose maximal error probability goes to zero as $n \to \infty$. An $(M,n)$ code for a channel (i.e. a specification of $p(y|x)$) is a mapping from $\{1,\ldots,M\}$ to $\cal X^n$ (the encoder) together with a deterministic mapping from $\cal Y^n$ to $\{1,\ldots,M\}$ (the decoder). The capacity is the supremum of achievable rates.

So, in the example above, rates below $\log |\cal X|$ are achievable (e.g. by sending $n/2$ symbols, each repeated twice, you get a rate of $\log |\cal X|/2$), but this is not the largest rate possible -- just sending the $n$ symbols as is gives you the full rate of $\log |\cal X|$.
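Here is a toy sketch in Python of such an $(M,n)$ code for the noiseless channel above; the `encode`/`decode` helpers and the alphabet size $q=4$ are purely illustrative assumptions:

```python
# A full-rate (M, n) = (4**n, n) code for the noiseless 4-ary channel:
# the encoder writes the message index in base 4, the decoder reads it back.
def encode(m, n, q=4):
    """Map message m in {0, ..., q**n - 1} to a codeword in X^n (base-q digits)."""
    return [(m // q**i) % q for i in range(n)]

def decode(codeword, q=4):
    """Invert the encoder: read the base-q digits back into a message index."""
    return sum(d * q**i for i, d in enumerate(codeword))

n, m = 3, 37
assert decode(encode(m, n)) == m
# Rate = log2(4**n) / n = 2 bits per channel use -- the full log2|X|, with zero error.
```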
