Random choice of distinctly-colored edges from edge-coloring of complete graph

combinatorics, graph-theory, probability, sieve-theory, upper-lower-bounds

Let $G = (V,E)$ be the complete graph on $n=2k+1$ vertices, and let $E = E_1\amalg\cdots\amalg E_n$ be a proper edge-coloring with $n$ colors. Suppose that for each $1\leqslant i\leqslant n$ we choose an edge $e_i\in E_i$ uniformly at random, in such a way that $e_1,\dots,e_n$ are jointly independent. Let $P(n)$ be the probability that $e_1,\dots,e_n$ span $V$, in the sense that for every $v\in V$ there exists some $i$ with $e_i$ incident to $v$.

I'm interested in upper bounds for $P(n)$. Specifically, I'd like to show $P(n)<\frac{1}{n(n+1)}$ for large $n$.

I'm not sure what the best way to attack this problem is. One thought I had was to consider the event $X_v$ that $v$ is not adjacent to any $e_i$. One can show
$$P(X_v) \approx e^{-2},$$
and I think the $X_v$ are "approximately independent" in the sense that for any subset $S\subseteq V$ with $|S| = o(\sqrt{n})$, we have

$$P\left(\bigcap_{v\in S} X_v\right) = \left(1+O\left(\frac{|S|^2}{n}\right)\right)\prod_{v\in S} P(X_v).$$

I thought perhaps applying sieve theory with this would allow us to construct a good upper bound for $P(n)$, but the best I could obtain with Selberg's upper bound sieve was $P(n) = O(1/n)$. I'm not very familiar with sieve theory though, so maybe the parameters I chose for the sieve were just inefficient.

For the most direct application to the particular problem I'm working on, the coloring is the standard one where $V = \mathbb{Z}/n$ and the edge $\{i,j\}$ is colored by $k\in\mathbb{Z}/n$ where $i+j\equiv k\pmod n$. In this case, I ran a Monte Carlo simulation with $100{,}000$ trials for each odd $n<50$, and the results suggest that $P(n)$ decreases exponentially in $n$ (at least for this choice of edge-coloring).
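For reference, a simulation along these lines can be sketched as follows (function names are my own; this assumes the $\mathbb{Z}/n$ coloring just described, with $n$ odd):

```python
import random

def color_classes(n):
    """Edge classes of the standard coloring of K_n on Z/n (n odd):
    the edge {i, j} gets color (i + j) mod n."""
    classes = []
    for k in range(n):
        edges = set()
        for i in range(n):
            j = (k - i) % n
            if i != j:
                edges.add((min(i, j), max(i, j)))
        classes.append(sorted(edges))
    return classes

def trial(n):
    """One trial: pick one edge per color independently; did we cover V?"""
    covered = [False] * n
    for edges in color_classes(n):
        i, j = random.choice(edges)
        covered[i] = covered[j] = True
    return all(covered)

def estimate_P(n, trials=100_000):
    """Monte Carlo estimate of P(n)."""
    return sum(trial(n) for _ in range(trials)) / trials
```

Since $n$ is odd, $2i\equiv k$ has a unique solution $i$ in $\mathbb{Z}/n$, so each color class has exactly $\frac{n-1}{2}$ edges and the coloring is proper.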

Eventually I'd like to generalize to some other situations, so I'm mostly interested in the techniques used to prove bounds like this, and as such I would prefer solutions that use as little of the specific structure as possible.

Best Answer

In each of the $n$ colors, there are $\frac{n-1}{2}$ edges. For a fixed vertex $v$, the probability that it is left uncovered by $e_1, \dots, e_n$ is the probability that, for each of the $n-1$ colors occurring on edges out of $v$, we don't choose the edge with an endpoint at $v$; this is $\left(1 - \frac{2}{n-1}\right)^{n-1}$, which tends to $e^{-2}$ as $n \to \infty$.
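Numerically, this convergence is easy to check (a quick sanity check, not part of the argument):

```python
import math

def p_uncovered(n):
    """Probability a fixed vertex is missed: (1 - 2/(n-1))^(n-1)."""
    return (1 - 2 / (n - 1)) ** (n - 1)

# The limit e^{-2} ≈ 0.1353 is approached from below as n grows.
```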

If, for all $v$, the events "$v$ is uncovered" were independent, then the probability that all vertices are covered would approach $(1 - e^{-2})^n$, which is exponentially small. Of course, the events are not independent. But they're mostly independent: knowing $v$ is uncovered doesn't significantly affect the odds that $w$ is uncovered. So we might expect that the true probability is not too far from $(1-e^{-2})^n$.

In general, proving "these events are independent enough" is hard. But if all we want is a bound of $\frac1{n(n+1)}$, then there is enough of a gap between what we want and what ought to be true that we can give a lot away in the service of making our life easier. One way to do this is with a coupling argument. That is, we describe a larger probability space, within which we have:

  • The original random process, where edges $\{e_1, \dots, e_n\}$ are sampled with the correct distribution;
  • A different, easier-to-understand random process with more independence;
  • Some relationship between the two processes.

Let $S$ be an arbitrary subset of $\sqrt n$ of the $n$ vertices. To describe the coupling, we give a randomized algorithm that generates the set $\{e_1, \dots, e_n\}$, and simultaneously assigns a label to each vertex of $S$ that is an upper bound on the number of edges from $e_1, \dots, e_n$ incident on that vertex.

Initially, all labels start at $0$. For each color class $E_i$, we will go through the edges of $E_i$ one at a time in some order that deals with the edges with an endpoint in $S$ first. When we get to the $j^{\text{th}}$ edge in $E_i$ (call that edge $vw$), we do the following:

  1. Let $p=\frac1{k+1-j}$ if we have not selected $e_i$ yet, and let $p=0$ otherwise. This is the probability with which we want to set $e_i = vw$.
  2. If both $v$ and $w$ are in $S$, then for each one, add $1$ to its label with probability $\sqrt{\frac 3n}$. If both of their labels increase by $1$ (which happens with probability $\frac 3n$), then with probability $\frac{p}{3/n}$, let $e_i = vw$. (Since edges meeting $S$ are dealt with first and each color class contains at most $\sqrt n$ of them, we have $j \leqslant \sqrt n$ here; so $p = \frac1{k+1-j}$ is at most $\frac1{k+1-\sqrt n} = \frac2{n+1-2\sqrt n} < \frac 3n$ for $n>30$ or so, and $\frac{p}{3/n}$ is a valid probability.)
  3. If just one of $v$ or $w$ (say, $v$) is in $S$, then add $1$ to $v$'s label with probability $\frac 3n$. If we do, then with probability $\frac{p}{3/n}$, let $e_i = vw$.
  4. If neither $v$ nor $w$ is in $S$, then just set $e_i = vw$ with probability $p$.
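The four steps above can be sketched in code. This is only an illustration (the names and the representation of color classes are my own), and it is a valid probability assignment only once $n$ is large enough that $p < \frac 3n$ whenever the edge meets $S$:

```python
import random

def coupled_sample(color_classes, S):
    """Generate (e_1, ..., e_n) together with labels on S that upper-bound
    each vertex's incidence count, following steps 1-4 above."""
    n = len(color_classes)
    q = 3 / n                                   # label-increase probability
    labels = {v: 0 for v in S}
    chosen = []
    for edges in color_classes:
        k = len(edges)
        # deal with the edges meeting S first
        edges = sorted(edges, key=lambda e: e[0] not in S and e[1] not in S)
        e_i = None
        for j, (v, w) in enumerate(edges, start=1):
            p = 1 / (k + 1 - j) if e_i is None else 0   # step 1
            in_S = [u for u in (v, w) if u in S]
            if len(in_S) == 2:                          # step 2
                bumps = {u: random.random() < q ** 0.5 for u in in_S}
                for u in in_S:
                    labels[u] += bumps[u]
                if all(bumps.values()) and random.random() < p / q:
                    e_i = (v, w)
            elif len(in_S) == 1:                        # step 3
                if random.random() < q:
                    labels[in_S[0]] += 1
                    if random.random() < p / q:
                        e_i = (v, w)
            elif random.random() < p:                   # step 4
                e_i = (v, w)
        chosen.append(e_i)
    return chosen, labels
```

By construction, whenever a selected edge is incident to some $v \in S$, that vertex's label is incremented in the same step; this is the coupling inequality used below.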

For each $v \in S$, the label on $v$ is at least the number of edges in $e_1, \dots, e_n$ incident on $v$, because whenever we select such an edge, we also increase the label of $v$. Also, the labels on the vertices in $S$ are independent, and with a relatively nice distribution: each label has $\sqrt n-1$ occasions on which it can increase with probability $\sqrt{\frac 3n}$, and $n-\sqrt n$ occasions on which it can increase with probability $\frac 3n$.

With probability $(1 - \sqrt{\frac 3n})^{\sqrt n -1} (1 - \frac 3n)^{n-\sqrt n}$, a vertex $v \in S$ never has its label increase. As $n \to \infty$, this probability tends to the rather awkward quantity $e^{-3-\sqrt 3} \approx 0.0088$. The labels on the vertices of $S$ are independent, so the probability that every vertex in $S$ has positive label is approximately $(1 - e^{-3-\sqrt 3})^{\sqrt n}$.
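As a sanity check on the constant (illustrative only, with the finite-$n$ product written out):

```python
import math

def never_bumped(n):
    """Finite-n probability that a fixed label on S never increases:
    (1 - sqrt(3/n))^(sqrt(n) - 1) * (1 - 3/n)^(n - sqrt(n))."""
    r = math.sqrt(n)
    return (1 - math.sqrt(3 / n)) ** (r - 1) * (1 - 3 / n) ** (n - r)

limit = math.exp(-3 - math.sqrt(3))   # e^{-3-sqrt(3)} ≈ 0.0088
```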

This is approximately $0.99^{\sqrt n}$, which is not quite the exponential in $n$ we wanted, but it eventually goes to $0$ much faster than $\frac1{n(n+1)}$. And if a vertex in $S$ has label $0$, then it is certainly not covered by the edges $e_1, \dots, e_n$; so the event that every vertex is covered is contained in the event that every label on $S$ is positive, and we obtain the bound we wanted.
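To see the comparison concretely (constants as above; a stretched exponential beats any polynomial eventually, though the crossover here is large):

```python
import math

c = 1 - math.exp(-3 - math.sqrt(3))    # ≈ 0.9912

def label_bound(n):
    """(1 - e^{-3-sqrt(3)})^{sqrt(n)}, the bound from the coupling."""
    return c ** math.sqrt(n)

def target(n):
    """1 / (n (n+1)), the bound we are after."""
    return 1 / (n * (n + 1))

# label_bound is still larger than target at n = 10^6,
# but already smaller by n = 10^8.
```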