Below is an image cropped out of a slide set I prepared for a presentation to former coworkers at Nokia.
An explanation is due. This is what a check matrix of a product of two codes looks like. Here $H$ is a check matrix of the first factor code, and $K$ is a check matrix of the second factor. If $b_1,b_2,\ldots,b_{n_1}$ are the natural basis vectors of the ambient binary space of the former factor code, and $c_1,c_2,\ldots,c_{n_2}$ are a similar basis for the ambient space of the latter factor, then the bit ordering that I use corresponds to
$$b_1\otimes c_1,\ b_2\otimes c_1,\ \ldots,\ b_{n_1}\otimes c_1,\quad b_1\otimes c_2,\ b_2\otimes c_2,\ \ldots,\ b_{n_1}\otimes c_2,\quad \ldots,\quad b_1\otimes c_{n_2},\ b_2\otimes c_{n_2},\ \ldots,\ b_{n_1}\otimes c_{n_2}.$$
The check equations in $H$ affect all the $n_2$ groups of $n_1$ bits (with a fixed $c$-factor). Similarly, the check equations in $K$ affect all the $n_1$ groups of $n_2$ bits (with a fixed $b$-factor). In my image it looks like $n_1=10$ and $n_2=8$.
The reason why this works is that the upper (grey) block of the check matrix
defines the code $C_1\otimes \mathbb{F}_2^{n_2}$, and the lower (green) block
defines the code $\mathbb{F}_2^{n_1}\otimes C_2$. Together these check equations
define the intersection of those two subspaces, i.e. $C_1\otimes C_2$.
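The two stacked blocks can be written compactly with Kronecker products. Below is a minimal sketch in Python with NumPy (function name my own), using the bit ordering above, in which the $b$-index varies fastest:

```python
import numpy as np

def product_check_matrix(H, K):
    """Check matrix of the product code, for the ordering b_1⊗c_1,
    b_2⊗c_1, ..., b_{n1}⊗c_{n2} (b-index varies fastest).
    Top block:    H applied within each of the n2 groups of n1 bits.
    Bottom block: K applied across the groups, one set per b-position."""
    n1 = H.shape[1]
    n2 = K.shape[1]
    top = np.kron(np.eye(n2, dtype=int), H)     # checks of C1 ⊗ F_2^{n2}
    bottom = np.kron(K, np.eye(n1, dtype=int))  # checks of F_2^{n1} ⊗ C2
    return np.vstack([top, bottom]) % 2

# Tiny example: both factors are the [3,2] single parity check code.
H = np.array([[1, 1, 1]])
K = np.array([[1, 1, 1]])
M = product_check_matrix(H, K)          # 6 rows, 9 columns
# A word x ∈ C1 tensored with y ∈ C2 becomes np.kron(y, x) in this ordering,
# and it satisfies all the checks:
x = np.array([1, 1, 0])                 # even weight, so x ∈ C1
y = np.array([1, 0, 1])                 # even weight, so y ∈ C2
assert np.all((M @ np.kron(y, x)) % 2 == 0)
```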
Caveat:
Here the number of $H$-checks is $r_1n_2$ and the number of $K$-checks is
$r_2n_1$, where $r_i=n_i-k_i$ is the number of redundant bits in the factor codes. Here
$$
n_1n_2-(r_1n_2+r_2n_1)=(k_1+r_1)(k_2+r_2)-(r_1(r_2+k_2)+r_2(k_1+r_1))=k_1k_2-r_1r_2
$$
This is less than the dimension $k_1k_2$ of the product code, so we see that there must be $r_1r_2$ linear dependencies among the rows of this huge matrix. If this is a concern for you, then you may need to follow this up with a row reduction step. The way I used these codes, it was (in a way) a benefit to have these redundant checks.
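The rank deficiency is easy to confirm numerically. Here is a sketch (Python/NumPy, helper name my own) using two $[3,2]$ single parity check codes, so $r_1=r_2=1$ and we expect exactly one dependency among the $r_1n_2+r_2n_1=6$ rows:

```python
import numpy as np

def gf2_rank(M):
    """Row rank of a binary matrix over GF(2), by Gaussian elimination."""
    M = M.copy() % 2
    rank = 0
    n_rows, n_cols = M.shape
    for c in range(n_cols):
        pivot = next((r for r in range(rank, n_rows) if M[r, c]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]
        for r in range(n_rows):
            if r != rank and M[r, c]:
                M[r] = (M[r] + M[rank]) % 2
        rank += 1
    return rank

# Both factors the [3,2] SPC code: n1 = n2 = 3, k1 = k2 = 2, r1 = r2 = 1.
H = np.array([[1, 1, 1]])
K = np.array([[1, 1, 1]])
M = np.vstack([np.kron(np.eye(3, dtype=int), H),
               np.kron(K, np.eye(3, dtype=int))])
# 6 rows, but rank n1*n2 - k1*k2 = 5: exactly r1*r2 = 1 dependency.
assert gf2_rank(M) == 5
```

(In fact the single dependency is visible by hand here: the sum of all six rows is zero, because every bit position is covered once by the top block and once by the bottom block.)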
You need to iterate this construction to get check matrices for a product of more than two codes.
There is also a general method for encoding a product code given the check matrices $H$ and $K$. That encoding method allows us to simply write the payload bits to certain positions, but it is in a sense a 2-dimensional process as opposed to the usual 1-dimensional way of calculating the parity bits as functions of information bits. This is outlined in the three following figures.
I use the $(16,11,4)$ extended Hamming code $C$ as both $C_1$ and $C_2$. Here is a version of its check matrix (perhaps a bit non-standard, but all versions are equivalent, so no matter):
The arrows in columns $7,11,13,14,15$ indicate the choice that these should be the check positions (indexing starts from zero, as it is engineers/programmers we are talking about). Notice that those five columns form an identity matrix, so they serve well in the role of check bits.
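The point of choosing columns that form an identity matrix is that each check bit then becomes a direct GF(2) sum of the information bits. A minimal sketch of this (Python/NumPy; the toy $[4,2]$ matrix and positions below are made up for illustration, not the $(16,11,4)$ matrix from the figure):

```python
import numpy as np

def encode_systematic(H, check_cols, info_bits):
    """Encode with a check matrix whose columns at check_cols form an
    identity matrix: each check bit is a GF(2) sum of information bits,
    so Hx = A·x_info + I·x_check = 0 is solved by x_check = A·x_info."""
    n = H.shape[1]
    info_cols = [j for j in range(n) if j not in check_cols]
    x = np.zeros(n, dtype=int)
    x[info_cols] = info_bits
    x[check_cols] = (H[:, info_cols] @ info_bits) % 2
    return x

# Toy [4,2] code: columns 2 and 3 of H form an identity matrix.
H = np.array([[1, 1, 1, 0],
              [1, 0, 0, 1]])
x = encode_systematic(H, [2, 3], np.array([1, 1]))
assert np.all((H @ x) % 2 == 0)    # x is a codeword
```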
Next I present individual words of the product code as $16\times 16$ matrices. Such a matrix is a word of the product code $C\otimes C$, iff all the columns and all the rows are words of the code $C$. There are $k_1k_2=11^2$ information bits. I put these in positions where both the row and the column index is an information position. These could be filled in any which way we want.
The way I did the filling in this example is quite regular. I did it this way simply to make it easier for me to do the steps that follow. Also, this way it is easier to check that the process was done correctly. The question marks indicate a check position.
In the first encoding stage I simply fill in the rows where the necessary 11 information bits are already known. This gives me 11 filled rows, leaving the check rows filled with question marks.
In the second encoding stage I then fill in the columns. As all the columns already have the 11 information bits there, this poses no problems.
This second stage filled in the 5 rows that were previously filled with question marks. By (bi)linearity these new rows will automatically also be words of $C$! In this sense each of those 5 rows also satisfies 5 extra parity check equations. These are the $r_1r_2$ extraneous check equations azimut asked about.
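The two-stage process above can be sketched in miniature. Here is a hedged Python version using the $[3,2]$ single parity check code in place of the $(16,11,4)$ extended Hamming code (so the $3\times 3$ array plays the role of the $16\times 16$ one); the point is that the last row, produced entirely by the column stage, comes out a codeword by itself:

```python
import numpy as np

def spc_encode(bits):
    """Systematic single-parity-check encoding: append the parity bit."""
    return np.append(bits, bits.sum() % 2)

data = np.array([[1, 0],
                 [1, 1]])            # k1*k2 = 4 information bits
M = np.zeros((3, 3), dtype=int)
for i in range(2):                   # stage 1: rows whose info bits are known
    M[i] = spc_encode(data[i])
for j in range(3):                   # stage 2: every column
    M[:, j] = spc_encode(M[:2, j])
# All rows AND all columns are now words of the SPC code, including the
# bottom check row that stage 1 left as question marks:
assert np.all(M.sum(axis=1) % 2 == 0)
assert np.all(M.sum(axis=0) % 2 == 0)
```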
My actual advice is not to use parity check matrices at all when the code is very long (thousands of bits). This applies with extra force to product codes, because encoding can be carried out more efficiently using $H$ and $K$ alone.
Best Answer
The proof seems wrong. It should be "Then, the columns $i_1,...,i_u$ of the parity check matrix must be linearly dependent." Also, one should set the null codeword aside. Corrected:
I agree that it can be confusing. I would rather prove it thus:
First assume $d(C)=d$. Then there is some codeword $c_0$ of weight $d$. Because $c_0 H^t = 0$, the $d$ columns of $H$ corresponding to the non-zero entries of $c_0$ are LD (linearly dependent). Furthermore, there cannot be a set of $m$ LD columns with $0<m<d$, because in that case we could write $x H^t = 0$ for some vector $x$ with $m<d$ non-zero entries; but the vectors satisfying that equation are the codewords, and there is no (non-null) codeword of weight less than $d$.
The other direction goes similarly. If there is some set of $d$ LD columns, but no such set of $0<m<d$ columns, then the (non-null) solutions of $x H^t = 0$ have minimum weight $d$; but these are the codewords, hence the distance is $d$.
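The characterization above translates directly into a brute-force distance computation: $d(C)$ is the smallest number of columns of $H$ that sum to zero over GF(2). A sketch (Python/NumPy, only feasible for short codes), checked against the $[7,4,3]$ Hamming code:

```python
import numpy as np
from itertools import combinations

def min_distance(H):
    """d(C) = smallest number of columns of H summing to zero over GF(2),
    exactly as argued above.  Exponential search, short codes only."""
    n = H.shape[1]
    for d in range(1, n + 1):
        for cols in combinations(range(n), d):
            if not np.any(H[:, list(cols)].sum(axis=1) % 2):
                return d
    return None  # no dependent columns: the code is the zero code

# Check matrix of the [7,4,3] Hamming code: all non-zero 3-bit columns.
H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])
assert min_distance(H) == 3
```

No single column is zero and all columns are distinct, so no set of 1 or 2 columns is dependent; the first dependent triple the search finds gives $d=3$, matching the argument in both directions.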