Understanding TREE(n) in Friedman’s Finite Kruskal’s Tree Theorem

definitiongraph theoryproof-theoryramsey-theorytrees

I was reading the Wikipedia article on Friedman's finite form of Kruskal's tree theorem, and am interested in the large numbers TREE(n). I would like to verify TREE(2)=3 myself, but find conflicting definitions of the conditions on the trees/embeddings on various websites. What properties do the trees have, and what do the relevant embeddings have to satisfy?

Edit:

For instance, a search for information led to this old posting (hereafter referred to as "that posting") which links to two pages claiming that TREE(2)=2, and suggests that those are a mistake. The post there uses the words "sequence of vertex-labelled rooted trees", which gives a hint at the meaning of TREE(n). The post also contains a potential example which may demonstrate that TREE(2)≥3.

On the other hand, Wikipedia says "…labelled trees with no order among siblings, parameterising on the size of the set of labels rather than on the size of the first tree in the sequence (and the homeomorphic embedding, ≤, now being inf- and label-preserving): For every n, there is an m so large that if T1,…,Tm is a finite sequence of trees with vertices labelled from a set of n labels, where each Ti has at most i vertices, then Ti ≤ Tj for some i<j."
This says labelled trees but makes no mention of them being rooted (except perhaps implicitly via "inf-preserving"; I'm not sure what that part means). It also brings up the word "homeomorphic", which if I take it literally as applying to a planar embedding of the trees, seems like a very odd notion to use since parts of the tree can be contracted (possibly skipping vertices?) in a homeomorphism.

Additionally, I seem to remember finding a source that seemed to imply that the trees had to be "full", and said that that meant something like "the tree is rooted, and at every level, the number of children for each vertex is the same", but I can't find that source now.

Finally, the other thing my searches pulled up was an old stack exchange question (http://math.stackexchange.com/questions/116478/quickest-way-to-understand-kruskals-tree-theorem) in which a user named "Bruce" references a paper: Jean H. Gallier, "What's so special about Kruskal's theorem and the ordinal Γ0? A survey of some results in proof theory." Ann. Pure Appl. Logic 53 (1991), no. 3, 199-260. I will summarize what appear to be the relevant parts.

Summary begins here

In that paper, the relevant theorem appears to be due to Friedman:

Theorem 5.2

Let Σ be a finite set. For every integer k≥2, there exists some integer n≥2 so large that, for any finite sequence <t1,…tn> of trees in TΣ with |tm|≤m for all m, 1≤m≤n, there exist some integers i1,…,ik such that 1≤i1<…<ik≤n and ti1≼…≼tik.

To unpack this statement, we must first understand what |t| signifies for a tree t, what TΣ is, and perhaps most crucially, what "≼" means. The relevant definitions appear at the beginning of section 4. They define (4.1) a "tree domain" to be a special kind of set of strings of positive naturals (where numbers seem to represent which child of a node is selected at each step), and given a set Σ of labels, a Σ-tree (4.2) is a function from a tree-domain to Σ. (This labels the nodes of a tree whose structure is given by the tree domain.) TΣ is the set of all finite Σ-trees. If t is a tree, |t| is the size of its domain (the number of nodes).

Finally, the beginning of section 5 suggests that "≼" is "the embedding preorder of Definition 4.11", but that also requires more definitions. Definition 4.3 explains that for a label f, f may also be used to represent the one-node tree ""↦f. It goes on to explain that if t1,…,tk are trees, and f is a label, then f(t1,…,tk) denotes the tree with root f where the ti's are attached immediately below. (The original wording is formal; I can clarify if necessary.)

The preorder ≼ of 4.11 is then defined inductively on TΣ satisfying two conditions: 1. If s and t are trees, then s≼t⇒s≼f(…,t,…). and 2. if 0≤m≤n and there are m integers ji such that 1≤j1<…<jm≤n and s1≼tj1,…,sm≼tjm then f(s1,…,sm)≼f(t1,…,tn).

In awkward words (using "immediate subtree" to mean "an entire subtree starting at a child of the root"): 1. If a tree "fits inside" an immediate subtree of another tree, then it fits inside the whole tree. and 2. If two trees have the same root, and the immediate subtrees each fit inside different corresponding immediate subtrees of the second, then the first tree fits inside the second.

Summary ends here

Assuming those definitions/statements from that paper are correct, then it seems like the integer n that we are guaranteed exists by Thm 5.2 depends on both k and the size of the label set Σ. It seems like Gallier's "k" has been set equal to 2 on Wikipedia, Gallier's "size of Σ" has been called "n" on Wikipedia, Gallier's dummy index "m" has been called "i" on Wikipedia, and Gallier's "n" is called "m" or "TREE(n)" on Wikipedia. With this in mind, it would seem that the example at that posting really does show that TREE(2)≥3.

Then I wonder if TREE(2) could be greater than 3. However, it seems like a single node of a given label precludes that label from being used again, so that the first two trees are forced up to an ordering of the labels; using bold to denote the head, they are "2" and "1–1". The third tree cannot use the label 2, so it must either be "1–1–1" or "1–1–1" or "1–1" or "1". The first three options contain "1–1" in the relevant way, so that the examples at that posting really is best-possible.

Remaining Questions:

Are there any errors in my calculation that TREE(2)=3? What do the parts of Wikipedia's definition mean if it is indeed equivalent to the one in Gallier's paper? (Or is it not equivalent?)

Best Answer

TREE(2) = 3

re: your specific question

Your calculation of TREE(2) = 3 is correct, and is essentially the same as what led me to make the old posting that you cite. I have no explanation of why Friedman would have written (in at least two places) that TREE(2) = 2, except perhaps as some sort of oversight and/or typographical error. There is some relevant discussion on the talk page of the WP article that you mention, which also links to other articles about the TREE function. You may also want to take a look at the comments and answers to this not-very-old MO question about TREE(3).

The TREE function

re: contrasting definitions

Friedman's TREE function concerns finite trees as posets that are single-rooted, vertex-labeled, and unordered. Unordered here means there is no order among siblings (i.e., vertices having a common parent); furthermore, no order nor quasi-order is assumed for the finite set of labels -- although he uses the labels 1,2,3,...,$k$, where $k$ is the argument of TREE($k$). The preceding establishes the context for Friedman's definition of the TREE function:

TREE($k$) is the length $n$ of the longest sequence of trees $T_1,...,T_n$ with vertices labeled from {1,...,k}, where each $T_i$ has at most $i$ vertices, and no $T_i$ is inf preserving and label preserving embeddable into a later $T_j$.

Note that in addition to TREE, variations on labeled/unlabeled, ordered/unordered siblings, quasi-ordered/unordered labels, embeddings with/without gap condition, etc. etc., have given rise to a plethora of similar-but-different fast-growing functions in the literature on this subject. The function Fr in the article that you mention, by Gallier, is one of these non-equivalent variations; more precisely, the argument of Fr is the length of a chain of embeddings, rather than the size of the label alphabet as in TREE (which concerns only one embedding, not a chain).