$\epsilon$-regular pair of vertex sets definition

combinatoricsgraph theoryprobabilistic-method

Introducing Szemeredi's Regularity Lemma requires the definition of $\epsilon$-regularity, which I cannot get good intuition for, because the official intuition clashes with, well let me explain.

Definition: for a fixed $\epsilon>0$ we say that the pair of vertex sets $(A,B)$ (which are subsets of the two vertex sets of a bipartite graph in most situations) is $\epsilon$-regular if for any subsets $A'\subset A, B'\subset B$ with $|A'|\ge\epsilon|A|$ and $|B'|\ge\epsilon|B|$ we have $$|d(A,B)-d(A',B')|\le\epsilon$$
where $d(A,B):=\frac{\#\text{edges between }A\text{ and }B}{|A||B|}$ is called the edge density (and this last definition happens to make perfect sense).

My issue with the way $\epsilon$-regularity is defined is with the condition $|A'|\ge\epsilon|A|$ and $|B'|\ge\epsilon|B|$.

$\epsilon$-regularity is supposed to capture the randomness of a graph: taking subsets won't result in a dramatical change in edge density.

But to me it would make way more sense to have $|A'|\ge(1-\epsilon)|A|$ and $|B'|\ge(1-\epsilon)|B|$ as a condition instead. According to the original definition, the smaller the $\epsilon$ the closer the edge densities are.

I guess in my version of the definition it is easier to be $\epsilon$-regular and it is actually always implied by the original $\epsilon$-regularity if $\epsilon<1/2$, because if if the densities are close for any sets of size $\ge \epsilon |A|$ and $\ge \epsilon|B|$ they will be close for any sets that are bigger than these bounds (just by the transitive property of $\ge$.

My confusion came from the fact that being close in size will result to being close in edge density, but the original definition wants more than that property (which maybe is even true for graphs that are far from being random).

Best Answer

When $A'$ and $B'$ are close in size to $A$ and $B$, such as the condition $|A'| \ge (1-\epsilon)|A|$ and $|B'| \ge (1-\epsilon)|B|$, we expect $d(A',B')$ to be close to $d(A,B)$, no matter what the graph structure is. So your version of the definition is far too weak.

For example, when $|A'| \ge (1-\epsilon)|A|$ and $|B'| \ge (1-\epsilon)|B|$, we can write $$|A'||B'|d(A',B') \le |A||B|d(A,B) \le |A'||B'| d(A',B') + 2\epsilon |A| |B|$$ just by counting edges. The first inequality says "there are at least as many edges between $A$ and $B$ as between $A'$ and $B'$." The second inequality says "we get an upper bound on the edges between $A$ and $B$ by assuming all the edges not between $A'$ and $B'$ are there." Anyway, dividing by $|A||B|$, we get $$ (1-2\epsilon + \epsilon^2) d(A',B') \le d(A,B) \le (1+\epsilon^2) d(A',B') $$ which is a form of saying "$d(A',B')$ is close to $d(A,B)$". It follows that $|d(A',B') - d(A,B)| \le 2\epsilon$. This doesn't quite imply your definition, but it's halfway there.


We don't want the $\epsilon$-regularity condition to almost hold for any pair $(A,B)$! We want the statement "$(A,B)$ is an $\epsilon$-regular pair" to actually mean something. So we ask for as much as possible - because the regularity lemma says we can.

You can imagine a more general $(\epsilon,\delta)$-regularity that says $$|d(A,B) - d(A',B')| \le \epsilon$$ whenever $|A'| \ge \delta |A|$ and $|B'| \ge \delta |B|$. But for both $\epsilon$ and $\delta$, the smaller they are, the better, and the regularity lemma says it's possible to make both of them as small as we want. So we just set them equal to simplify the statement.

Finally, we want the regularity condition to be useful. When you start using the regularity lemma, you'll see that the actual form is what we need to find large substructures in our graph.

Related Question