Choosing columns in array

combinatorics

In an array with four rows and $n\geq 1$ columns, some subset of cells are marked. Can we always choose some columns in such a way that for at least three rows, the number of chosen marked cells and the number of unchosen marked cells differ by at most $1$?

An initial try is to pick some three fixed row (for example the first three rows) and see if it is possible to choose the columns in the required way. However, this is not always the case: it could be that there are three columns, in the first row the first and second columns are marked, in the second row the second and third column, and in the third row the third and first columns. But maybe there is some clever way to pick the three rows?

Best Answer

Yes, we can.

Let $P(n)$ be as follows :

$P(n)$ : In the $4\times n$ array, we can choose some columns in such a way that for at least three rows, the number of chosen marked cells and the number of unchosen marked cells differ by at most $1$.

In the following, $[a,b]$ represents the cell in the $a$-th row and in the $b$-th column. Also, $[a,b]=[c,d]$ means that either that both $[a,b]$ and $[c,d]$ are marked or that both $[a,b]$ and $[c,d]$ are unmarked.

Claim : If $P(7),P(8)$ are true, then $P(n)$ is true for all $n\ge 9$.

Proof :

There are $2^3=8$ possibilities for three cells to be marked or unmarked. So, if $n\ge 9$, then, by the pigeonhole principle, there is at least one pair of distinct integers $(s,t)$ such that $[1,s]=[1,t],$$[2,s]=[2,t]$ and $[3,s]=[3,t]$. We may suppose that $[1,1]=[1,2],$$[2,1]=[2,2]$ and $[3,1]=[3,2]$. Now, consider the $4\times (n-2)$ array made by the $(n-2)$ columns from the third to the $n$-th. Suppose that $P(n-2)$ is true. Then, we can choose some columns (call them $p$ columns) which satisfy our condition. Now, let us come back to the $4\times n$ array. If choosing the first column and the $p$ columns works, then we are done. If it doesn't work, then choosing the second column and the $p$ columns works. So, $P(n)$ is true. $\quad\square$


In the following, let us prove that $P(1),P(2),\cdots,P(8)$ are true.

We use the following two lemmas (the proofs are written at the end of the answer) :

Lemma 1 : If $P(n-1)$ is true and there is at least one column whose four cells are unmarked, then $P(n)$ is true.

Lemma 2 : If $P(n-2)$ is true and there is a pair of distinct integers $(s,t)$ such that at least three of $[i,s]=[i,t]\ (i=1,2,3,4)$ hold, then $P(n)$ is true.

From the lemmas,

$\ \ (\star)$ : we may suppose that there is at least one marked cell in each column and that there is no pair of distinct integers $(s,t)$ such that at least three of $[i,s]=[i,t]\ (i=1,2,3,4)$ hold.

This $(\star)$ reduces the number of cases to consider.

  • For $n=1$, we can choose the column.

  • For $n=2$, we can choose the first column.

  • For $n=3$, if choosing the first and the second columns works, then we are done. If it doesn't work, then we may suppose that $[1,1],[1,2],[2,1],[2,2],[3,1]$ are marked and $[1,3],[2,3],[3,2]$ are unmarked. Then, choosing the first column works.

  • For $n=4$, if there is a pair of distinct integers $(s,t)$ such that $[1,s]=[1,t]$ and $[2,s]=[2,t]$, then we may suppose that $[3,s]$ is marked and that $[3,t]$ is unmarked.
    case 1 : If $[3,u]=[3,v]$ where $u\not=v$ are different from $s$ and $t$, then choosing the $s$-th and the $u$-th columns works.
    case 2 : If $[3,u]$ is marked and $[3,v]$ is unmarked, then choosing the $s$-th and the $v$-th columns works.
    If there is no pair of distinct integers $(s,t)$ such that $[1,s]=[1,t]$ and $[2,s]=[2,t]$, then we may suppose that $[1,1],[1,3],[2,1],[2,4]$ are marked and $[1,2],[1,4],[2,2],[2,3]$ are unmarked. If choosing the first and the second columns works, then we are done. If it doesn't work, then from $(\star)$, choosing the first column works.

  • For $n=5$, if there are two distinct pairs of distinct integers $(s,t),(u,v)$ such that $[1,s]=[1,t],$$[2,s]=[2,t],$$[1,u]=[1,v]$ and $[2,u]=[2,v]$, then choosing the $s$-th and the $v$-th columns works.
    If there is only one pair of distinct integers $(s,t)$ such that $[1,s]=[1,t]$ and $[2,s]=[2,t]$, then we may suppose that we have the following three cases :
    case 1 : $[1,1],[1,2],[2,1],[2,2],[3,1]$ are marked and $[3,2],[1,5],[2,5]$ are unmarked. If choosing the first and the third columns works, then we are done. If it doesn't workd, then choosing the second and the third columns works.
    case 2 : $[1,1],[1,2],[1,3],[2,3],[2,4],[3,1]$ are marked and $[1,4],[1,5],[2,1],[2,2],[3,2]$ are unmarked. If choosing the first and the third columns works, then we are done. If it doesn't work, then choosing the second and the third columns works.
    case 3 : $[1,3],[1,5],[2,4],[2,5],[3,1]$ are marked and $[1,1],[1,2],[1,4],[2,1],[2,2],[2,3],[3,2]$ are unmarked. If choosing the first and the third and the fourth columns workds, then we are done. If it doesn't work, then choosing the first and the fifth columns works.

  • For $n=6$, we may suppose that there are two distinct pairs of distinct integers $(s,t),(u,v)$ such that $[1,s]=[1,t],$$[2,s]=[2,t],$$[1,u]=[1,v]$ and $[2,u]=[2,v]$. Then, we may suppose that $[3,s],[3,u]$ are marked and that $[3,t],[3,v]$ are unmarked. Now, choosing the $s$-th and the $v$-th and the one from the rest two columns works.

  • For $n=7$, we may suppose that there are three distinct pairs of distinct integers $(a,b),(c,d),(e,f)$ such that $[1,a]=[1,b],$$[2,a]=[2,b],$$[1,c]=[1,d],$$[2,c]=[2,d],$$[1,e]=[1,f]$ and $[2,e]=[2,f]$. Then, we may suppose that $[3,a],[3,c],[3,e]$ are marked and that $[3,b],[3,d],[3,f]$ are unmarked. If $[3,g]$ is marked where $g$ is different from $a,b,c,d,e$ and $f$, then choosing the $a$-th, the $c$-th and the $f$-th columns works. If $[3,g]$ is unmarked, then choosing the $a$-th , the $d$-th and the $f$-th columns works.

  • For $n=8$, we may suppose that there are four distinct pairs of distinct integers $(a,b),(c,d),(e,f),(g,h)$ such that $[1,a]=[1,b],$$[2,a]=[2,b],$$[1,c]=[1,d],$$[2,c]=[2,d],$$[1,e]=[1,f],$$[2,e]=[2,f],$$[1,g]=[1,h]$ and $[2,g]=[2,h]$. Then, we may suppose that $[3,a],[3,c],[3,e],[3,g]$ are marked and that $[3,b],[3,d],[3,f],[3,h]$ are unmarked. Now, choosing the $a$-th, the $d$-th, the $e$-th and the $h$-th columns works.


Finally, let us prove the two lemmas :

Lemma 1 : If $P(n-1)$ is true and there is at least one column whose four cells are unmarked, then $P(n)$ is true.

Proof for Lemma 1 :

Consider the $4\times (n-1)$ array excluding the column. Since $P(n-1)$ is true, we can choose suitable columns (call them $p$ columns) for the $4\times (n-1)$ array. Now, coming back to the $4\times n$ array, we see that choosing the $p$ columns works. $\quad\square$

Lemma 2 : If $P(n-2)$ is true and there is a pair of distinct integers $(s,t)$ such that at least three of $[i,s]=[i,t]\ (i=1,2,3,4)$ hold, then $P(n)$ is true.

Proof for Lemma 2 :

Consider the $4\times (n-2)$ array excluding the $s$-th and the $t$-th columns. Since $P(n-2)$ is true, we can choose suitable columns (call them $p$ columns) for the $4\times (n-2)$ array. Now, let us come back to the $4\times n$ array. If choosing the $s$-th column and the $p$ columns works, then we are done. If it doesn't work, then choosing the $t$-th column and the $p$ columns works. $\quad\square$

Related Question