Minimal mutual information for the Binary Symmetric Channel


I am working on the following exercise:

Let $X, Y$ be RVs with values in $\mathcal{X} = \mathcal{Y} = \{0, 1\}$ and let $p_X(0) = p$ and $p_X(1) = 1-p$. Let $\mathcal{C} = (X, P, Y)$ be the channel with input RV $X$, output RV $Y$ and transition matrix

$$P = \begin{bmatrix} q &1-q\\ 1-q &q \end{bmatrix}.$$

  1. Compute $I(X;Y)$.
  2. For which values of $q$ is $I(X; Y)$ minimal?

For 1. we note that

$$I(X;Y) = H(Y) - H(Y \mid X) = H(Y) - H(q,1-q).$$
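Spelling this out: writing $H_b(t) := -t \log_2 t - (1-t)\log_2(1-t)$ for the binary entropy function, so that $H(q,1-q) = H_b(q)$, and noting that $p_Y(0) = pq + (1-p)(1-q)$, this becomes

$$I(X;Y) = H_b\bigl(pq + (1-p)(1-q)\bigr) - H_b(q).$$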

However, I do not see how to do 2. Could you help me?

Best Answer

I'm assuming you meant to say $\mathcal{X} = \mathcal{Y} = \{0,1\}$: if the two random variables themselves were equal, i.e. $Y = X$, then $I(X;Y)$ would be fixed and $P$ would be meaningless.

Your expansion doesn't really help, because both $H(Y)$ and $H(Y \mid X)$ depend on $q$. Instead, let's look at $I(X;Y) = H(X) - H(X \mid Y)$, using the symmetry of mutual information. $H(X)$ is fixed for a fixed $p$, so to minimize $I(X;Y)$ we only need to maximize $H(X \mid Y)$ over $q$.
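Making that explicit: conditioning cannot increase entropy, so $H(X \mid Y) \le H(X)$, with equality exactly when $X$ and $Y$ are independent. Hence

$$I(X;Y) = H(X) - H(X \mid Y) \ge 0,$$

and the smallest possible value, $I(X;Y) = 0$, is attained precisely for those $q$ that make $X$ and $Y$ independent.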

If $q = 0.5$, observing $Y$ doesn't give any information about $X$: our best strategy is equivalent to flipping a $p$-biased coin, using the outcome as our guess and ignoring $Y$ entirely. So, for $q = 0.5$, $H(X \mid Y) = H(X)$, which is the maximum possible value, because conditioning can't increase entropy. You can plug in the joint distribution of $X, Y$ to verify this intuitive explanation.
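A minimal numerical sketch of that check (the helper names `entropy` and `h_cond_and_mi` are just illustrative, and $p = 0.3$ is an arbitrary choice): build the joint distribution of $X, Y$ for a grid of $q$ values and see where $H(X \mid Y)$ is largest and $I(X;Y)$ smallest.

```python
import numpy as np

def entropy(dist):
    """Shannon entropy in bits of a probability vector (0 log 0 := 0)."""
    dist = np.asarray(dist, dtype=float)
    nz = dist[dist > 0]
    return float(-np.sum(nz * np.log2(nz)))

def h_cond_and_mi(p, q):
    """Return (H(X|Y), I(X;Y)) for the BSC with p_X = (p, 1-p) and parameter q."""
    p_x = np.array([p, 1 - p])
    P = np.array([[q, 1 - q],
                  [1 - q, q]])              # P[x, y] = P(Y = y | X = x)
    joint = p_x[:, None] * P                # joint[x, y] = P(X = x, Y = y)
    p_y = joint.sum(axis=0)
    # H(X|Y) = sum_y P(Y = y) * H(X | Y = y)
    h_x_given_y = sum(p_y[y] * entropy(joint[:, y] / p_y[y])
                      for y in range(2) if p_y[y] > 0)
    return h_x_given_y, entropy(p_x) - h_x_given_y   # I(X;Y) = H(X) - H(X|Y)

p = 0.3                                     # any fixed input distribution with 0 < p < 1
qs = np.linspace(0.0, 1.0, 101)
vals = [h_cond_and_mi(p, q) for q in qs]
print("q maximizing H(X|Y):", qs[np.argmax([h for h, _ in vals])])    # expect 0.5
print("q minimizing I(X;Y):", qs[np.argmin([mi for _, mi in vals])])  # expect 0.5
```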