Bipartite Version of Hamming Bound for Codewords with Large Hamming Distance

co.combinatoricscoding-theoryentropyhamming distanceit.information-theory

Update: In light of Fedor Petrov's answer, I added an additional requirement that all strings in $A$ and $B$ have Hamming weight exactly $n/2$, which hopefully makes the question more interesting.

Without this additional requirement on the Hamming weight, as Fedor Petrov pointed out, we can let $A$ contain all strings of Hamming weight $n/2+d/2$ and let $B$ contain all strings of Hamming weight $n/2-d/2$.

Question: Let $A,B\subset \{0,1\}^n$ be two subsets of binary strings of length $n$ with equal cardinality (i.e. $|A| = |B|$). Suppose they have the following properties:

  1. All strings in $A\cup B$ have Hamming weight exactly $n/2$.

  2. For any $a\in A$ and $b\in B$, the Hamming distance between $a$ and $b$ is at least $d$ for some $d = o(n)$.

Then, what is the largest possible size of $A$ and $B$?

My guess: My guess is something around $2^{n-\Theta(d)}$, which is the bound we get when we require the strings in $A$ to have pairwise Hamming distance at least $d$ (see Remark 2 below). However the proof in the pairwise case does not seem to apply here.

Remark 1: If the cardinalities of $A$ and $B$ were allowed to be lopsided, then it would be possible to make $A$ contain almost everything in $\{0,1\}^n$. Specifically, we could just let $B$ contain the all-zero string $0^n$, to which almost all strings in $\{0,1\}^n$ have Hamming distance $\Omega(n)$.

Remark 2: The answer to the following closely related question is characterized by the Hamming bound and Gilbert–Varshamov bound:

What is the maximum possible size of a subset $C\subset \{0,1\}^n$ of
lengh-$n$ binary strings with pairwise Hamming distance at least
$d$?

By the two bounds mentioned above, the answer is known to be between

$$
\left[\frac{2^n}{\sum_{j=0}^{d-1} \binom{n}{j}}, \frac{2^n}{\sum_{j=0}^{\frac{d-1}{2}} \binom{n}{j}} \right].
$$

Note that this impies that the largest size of $C$ is bounded by $2^{n-{\Theta}(d)}$.

Best Answer

If the Hamming distance between $A$ and $B$ is at least $d$, it yields that $B$ is disjoint from the $(d-1)$-neighborhood of $A$. By isoperimetric inequality for a Boolean cube (Harper's theorem), the smallest $d$-neighborhood of $A$ for a given size $|A|$ is attained when $A$ consists of all points with sum of coordinates at most $T$ and several points with sum of coordinates exactly $T+1$ (for appropriate $T$). In this case $B$ essentially consists of points with sum of coordinates at least $T+d$. Since we want $|A|=|B|$, we should take $T=n/2-d/2$. Then $|A|\sim 2^{n-\Theta(d^2/n)}$ by central limit theorem. This is best possible.

For the updated question, take two Hamming balls of radius $n/2−d/2$ centered in opposite points $(0,...,0,1,...,1)$ and $(1,...,1,0,...,0)$ (each center has weight $n/2$) and leave only points of weight $n/2$ in both balls. This still has the same asymptotics.

Related Question