[Math] How to cover a set in a grid with as few rectangles as possible

co.combinatoricscoveringdiscrete geometry

In calculus, when estimating a area of a set in a 2-dimensional space, we use rectangles to approximate. To get sufficient precision, how many rectangles are needed if the shape of the set is close to a rectangle? I formalize the discrete version of the problem as follows.

Suppose we have a $N\times N$ grid (I assume it is a $N$ rows of squares, each row contains $N$ squares), and a set, say $S$, contains at least $r N^2$ squares, $r<1$. Now we wanna cover $S$ using rectangles approximately. Here retangle is defined in this way. Pick several rows, maybe not contiguous, and several columns, maybe not contiguous either, all the squares in crossing form a square. For instance, all the black squares in chessboard consist of two disjoint rectangles.

The requirements are

1) all rectangles are disjoint with each other.

2) The number of misplaced squares (i.e. the squares outside $S$ but covered and the squares in $S$ but not covered) $\leq\epsilon |S|$, where $\epsilon$ is considered to be a small positive constant.
Question is how many rectangles are sufficient.

My guess is $poly(\frac{1}{r})$.

Best Answer

In general, I don't think you can expect a set to be well approximated by such a small number of rectangles.

Let $S$ be a random set formed by including every square with probability $1/2$. Then with high probability $S$ has $r \geq 0.5-\epsilon$ for any $\epsilon$.

Now consider any (fixed) arbitrary set $T$. The error of $T$ from $S$ can be thought of as the sum of $N^2$ Bernoulli trials, each with probability $1/2$. It follows from the Chernoff bound that for any fixed $\epsilon<1/2$ the probability of having error at most $\epsilon N^2$ is at most $c^{n^2}$ for some constant $c<1$ depending only on $\epsilon$.

On the other hand, there are only $4^N$ rectangles (choose whether or not to include each row and column), so at most $4^{Nk}$ unions of $k$ rectangles. Taking the union bound over all such unions, we see that with high probability $S$ is not approximated by any union of $o(N)$ rectangles.

In general, I have a feeling (though I'm not familiar enough with this area to say for certain) that a better way to explain this all is in terms of information theory/entropy -- Specifying that a set has density $r$ still leaves you with approximately $N^2 H(r)$ (where $H$ is the entropy function) bits of entropy. On the other hand, the union of $k$ rectangles has less than $2kN$ such bits. You can't compress the former into the latter if $k$ is much less than $NH(r)$ without incurring a fair amount of error.

Related Question