Let $G$ be a family of simple, unlabeled and undirected graphs, $G(n)$ the graphs on $G$ with $n$ vertices and $G^c(n)$ the graphs of $G(n)$, which are connected. I thought about the problem of determing $|G(n)|$, when the values of $|G^c(k)|$ are known for $k = 1\dots n$.
All families I am looking at have the nice property, that the connected components of a graph $g \in G$ are also in $G$. The family of bipartite graphs for example has this property, since every connected component of a bipartite graph is bipartite.
I developed a formula for the problem described at the beginning, that i thought would be right, but implementing this formula calculates wrong numbers.
Some notation:
Let $P(n)$ be the set of (distinct) number partitions of $n$ and for a partition $p = (p_1, p_2, \dots, p_m) \in P(n)$ let $l(p) := m$ the number of "parts" of $p$. For example for the partition $p = (4,2,2,1)$ of the integer $9$
is $l(p) = 4$
The formula i developed looks like this:
$$|G(n)| = \sum_{p \in P(n)} \Bigl( \prod_{k=1}^{l(p)} |G^c(p(k))|\Bigr)$$
I came up with this formula because of some basic combinatorial arguments:
- For a given partition $p = (p_1, \dots , p_m) \in P(n)$, the expression $\prod_{k=1}^{l(p)} |G^c(p(k))|$ counts the graphs of $G(n)$, which consist of connected components of sizes $p_1, \dots , p_m$
- If a graph has connected components of sizes $p_1, \dots, p_m$ (ordered descending), then $(p_1, \dots, p_m) \in P(n)$
So if I summarize over all partitions of $n$ I should get the right answer. However, i implemented this formula for my bachelor thesis and get wrong numbers. Here is a table showing the numbers for bipartite graphs:
\begin{array}{c|c|c|c|c|c}
n & 8 & 9 & 10 & 11 & 12 \\ \hline
\text{right number} & 303 & 1119 & 5479 & 32303 & 251135 \\ \hline
\text{calculated number} & 306 & 1122 & 5495 & 32322 & 251320 \\ \hline
\text{difference} & 3 & 3 & 16 & 19 & 185
\end{array}
For $n \leq 7$, the numbers are right. Above $n=7$, the numbers calculated with the formula (second row) are higher than the right numbers (first row). In the third row i put the difference between the two.
My question is:
Did I miss some combinatorial argument while developing the formula? Are there for example some edge cases i am missing so that some graphs get counted twice with my formula?
Best Answer
The problem shows us up in cases where you have
For the family of bipartite graphs, this needs $n \ge 8$ to happen, because only at $s=4$ do we have more than one connected bipartite graph of order $s$. They are $C_4$ (the cycle with $4$ vertices), $P_4$ (the path with $4$ vertices), and $K_{1,3}$ (the star with $4$ vertices).
When you're counting graphs with two components of order $4$, a graph with two different components is double-counted. For example, say $G$ is the disjoint union of $P_4$ and $C_4$. Then it's counted once when you choose $P_4$ as the first component, and $C_4$ as the second; it's counted again when you choose $C_4$ as the first component, and $P_4$ as the second.
As a result, the disjoint unions $P_4 \sqcup C_4$, $P_4 \sqcup K_{1,3}$, and $C_4 \sqcup K_{1,3}$ are the $3$ bipartite graphs of order $8$ that you've double-counted.
The other answer gives you a way to solve this problem with generating functions. If you prefer to stick to the same kind of formula you already have, you will need to update the $$\prod_{k=1}^{l(p)} |G^c(k)|$$ term in your formula. Let $n_s(p)$ denote the number of parts of order $s$ in $p$. Then the number of ways to choose the components of those orders is given by the multiset number $$ \left(\!\!\binom{|G^c(s)|}{n_s(p)}\!\!\right) = \binom{|G^c(s)|+n_s(p)-1}{n_s(p)} $$ and so we change the formula to $$ \sum_{p \in P(n)} \prod_{s \ge 1}\binom{|G^c(s)|+n_s(p)-1}{n_s(p)}. $$