[Math] Finding the information sets of a linear code.

I'm trying to get a better understanding of linear codes, so I decided to work on problems from various textbooks. I'm having trouble understanding how to do this problem, and I was wondering if anyone can lead me in the right direction.

Problem The matrix $G = [I_{4} | A]$, where
$$ G = \left[ \begin{array}{cccc|ccc} 1 & 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1\end{array} \right] $$

is a genrator matrix in standard form for a $[7,4]$ binary code, denoted by $\mathcal{H}_3$. The parity check matrix for $\mathcal{H}_3$ is
$$H = [A^{T} | I_{3}] = \left[ \begin{array}{cccc|ccc} 0 & 1 & 1 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 & 0 & 0 & 1 \end{array} \right]. $$

Find at least four information sets in $\mathcal{H}_3$. Find at least one set of four coordinates that do not form an information set. $\blacksquare$

The book defines an information set as follows: Given a $[n,k]$ linear code $\mathcal{C}$, a generator matrix for $\mathcal{C}$ is any $k \times n$ matrix $G$ whose rows form a basis for $\mathcal{C}$. For any set of $k$ independent columns of $G$, the corresponding set of coordinates forms an information set for $\mathcal{C}$.

Any help would be greatly appreciated since I've been staring at this for quite some time. Thanks!

Best Answer

As I interpret your book's definition of an "information set", the "set of coordinates" that corresponds to a set of columns is literally the index of those columns in the matrix $G$. So if $$G=\left[\begin{array}{cccc|ccc} 1 & 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \end{array}\right],$$ the "set of coordinates" corresponding to "the first, second, fourth, and seventh columns" is literally just (1,2,4,7). Since these columns are not linearly independent, this would be an example of a set of four coordinates which do not form an information set.

That $G$ has four independent columns is what allows it to uniquely encode a message of length four. If you take any four-digit message $m$ and multiply $mG$ then the resulting 7-digit codeword can be translated uniquely back to the original message; if $G$ had fewer than $4$ independent columns, this multiplication would yield the same result for more than one 4-digit message, and in a sense you would lose information. You could never be sure you decoded certain messages correctly.

Another way to think of the situation is that because $G$ has four independent columns, it has rank 4 and thus when you put it in row-reduced echelon form, it will be a standard generator matrix (the $G$ in this example already is a standard generator matrix). If it wasn't, though, and yet it still had four independent columns, you could convert it easily. Now, if you look at what happens when you multiply $mG$, the result is the message $m$ itself with three additional digits tacked onto the end. If the columns of the identity matrix were at other coordinates, the codeword would have the message $m$ rearranged as those columns are rearranged, with additional digits inserted where the other columns are.

I hope this rambling was useful rather than even-more confusing. This document has a pretty good explanation of information sets as well (see p. 39). (I'm not sure who to credit, but the document originally came from Jonathan Hall's MSU webpage)

Best Answer

Related Solutions

[Math] Finding parity check matrix for linear code

Related Question