[Math] Number of different ways to join 4 tables

combinatorics

I am reading a database textbook which says the following:

enter image description here

There is no further explanation or elaboration for how this formula was arrived at. In attempting to derive it myself, I come up with something different. For example:

For N = 2 (two tables A and B), I count 2 ways to join them:

A join B
B join A

Now for N = 3 (three tables A, B, C), I count 12 ways to join them, which agrees with the text. The way I get this is by considering that after the first join (out of the two joins that must be performed total in a 3-table situation), there will be 2 tables remaining, and in that case we've already seen there are 2 options. So we need to find out how many different options we have for this "first join" and then multiply that by 2. There are 6 "first step" options (enumerated below), and 6 x 2 = 12.

A join B
A join C
B join A
B join C
C join A
C join B

Now, for N = 4 (four tables A, B, C, D), I count 144 ways to join, which does not agree with the 120 quoted in the text. As before, my strategy is to find the number of ways to make the "first join" and then multiply that by the N = 3 option count (12), since at this point 3 tables and 2 joins will remain. I count 12 options for the "first join" which leads to 12 x 12 = 144.

A join B
A join C
A join D
B join A
B join C
B join D
C join A
C join B
C join D
D join A
D join B
D join C

I am assuming the text is correct and I am mistaken. Where is my error?

Best Answer

My first suggestion would have been to better ask this question on https://math.stackexchange.com/ , but then I just spent some time researching a possible answer. By the way, I found the original quote on Google Books, and also the referenced paper by Selinger 79.

Of course, the factorial indicates the formular describes a problem in combinatorics, but I was intrigued by the expression (2*(n-1))! as I could not figure out what the 2*(n-1) would refer to.

If we take n as the number of tables ("relations" in database theory language), then n-1 is the number of JOINs. But how should the double of the number of JOINS be relevant for the result?

As it turned out, the formula (2*n)!/n! is related to Catalan numbers (see section Quadruple factorial):

In combinatorial mathematics, the Catalan numbers form a sequence of natural numbers that occur in various counting problems, often involving recursively defined objects.

From the application

Cn is the number of different ways n + 1 factors can be completely parenthesized (or the number of ways of associating n applications of a binary operator)

I conclude, that if we take a JOIN as binary operator, then the number of alternatives in OP is Cn * number of combinations of tables.