Drawing partial trees of different sizes side-by-side

Tags: forest, graphs, qtree, tikz-trees, trees

There exists an elaborate thread about how to draw rooted trees in LaTeX, e.g. for use in natural language applications. There exist packages other than TikZ to do this, like qtree or forest. However, a common theme in all these trees is that the leaves eventually join together in one overarching root. There are applications where this isn't the case; one example is hierarchical segmentation like byte-pair encoding. Another could be cutting across an agglomerative clustering hierarchy.

The net result is that you get multiple trees, but the leaves of one tree must be aligned and the leaves of different trees must also be aligned. Here is a hypothetical example I drew in Paint for aggregating the letters of discombobulate, ending at a segmentation discom+bobulate:

How would one reproduce this in LaTeX, given a list of merges? There are several freedoms:

You are allowed to choose any specification format for the merges/tree(s). The list of merges comes from a Python program anyway, so it doesn't take much to adapt to the format you think is easiest. I have a programmatic representation of the tree, so it's trivial to convert it to something like Newick format which is almost literal forest code.
You are free to choose the height at which a non-leaf node is placed, as long as it is (1) discretised and (2) above its constituents.
If you prefer right-angled branches like in a dendrogram, that is fine too.

There are several requirements I would like a solution to have, to make it sufficiently general:

Should allow more than two constituents for one node (e.g. "d", "i", "s" merge together immediately into "dis" without passing through "is"). Hence, please don't use a package that only allows drawing binary trees.
Should have some control over horizontal and vertical compactness, i.e. how closely packed the layers are and how far the leaves are from each other.
Should allow turning off intermediate node names.

The latter would produce something like the following image:

Note to editors: a better title to this question is always appreciated. I don't like the title I came up with.

Best Answer

Using the forest package:

with forked edge:

\documentclass[margin=3mm, varwidth]{standalone}
\usepackage[edges]{forest}

\begin{document}
    \begin{figure}[ht]
\forestset{
    LT/.style = {% Linguistic tree
        % nodes with empty content become bare coordinates (no label)
        delay={where content={}{shape=coordinate}{}},
        % put all leaves on one tier so that they line up at the bottom
        where n children=0{tier=word, baseline}{},
        for tree={
            text height = 2ex,
            text depth  = 0.5ex,
            inner ysep  = 0pt,
            inner xsep  = 1pt,
            forked edge,    % for forked edge
            s sep = 1mm,    % sibling distance
        }}}
    
\begin{forest}  LT
[discom
    [
        [d]
            [
                [i]
                [s]
            ]
    ]
    [
        [c]
            [
                [o]
                [m]
            ]
    ]
]
\end{forest}
\quad
\begin{forest}  LT
[bobulate
    [
        [b]
            [
                [o]
                [b]
            ]
    ]
    [
        [
            [
                [u]
                [l]
            ]
        ]
        [
            [
                [a]
                [t]
            ]
            [e]
        ]
    ]
]
\end{forest}
    \end{figure}
\end{document}

as linguistic tree:

\documentclass[margin=3mm, varwidth]{standalone}
\usepackage[linguistics]{forest}

\begin{document}
    \begin{figure}[ht]
\forestset{
    LT/.style = {% Linguistic tree
        % nodes with empty content become bare coordinates (no label)
        delay={where content={}{shape=coordinate}{}},
        % put all leaves on one tier so that they line up at the bottom
        where n children=0{tier=word, baseline}{},
        for tree={
            text height = 2ex,
            text depth  = 0.5ex,
            inner ysep  = 0pt,
            inner xsep  = 1pt,
            s sep = 1mm,    % sibling distance
        }}}
    
\begin{forest}  LT
[discom
    [
        [d]
            [
                [i]
                [s]
            ]
    ]
    [
        [c]
            [
                [o]
                [m]
            ]
    ]
]
\end{forest}
\quad
\begin{forest}  LT
[bobulate
    [
        [b]
            [
                [o]
                [b]
            ]
    ]
    [
        [
            [
                [u]
                [l]
            ]
        ]
        [
            [
                [a]
                [t]
            ]
            [e]
        ]
    ]
]
\end{forest}
    \end{figure}
\end{document}

Addendum:
If you would like the distance between the letters at the bottom of the trees to match the distance between the trees, first define a new command for this distance, for example

\newcommand\SD{1 mm}

and then replace

\quad with \hskip \SD
s sep = ... with s sep = \SD + 1mm

MWE:

\documentclass[margin=3mm, varwidth]{standalone}
\usepackage[linguistics]{forest}

\begin{document}
    \begin{figure}[ht]

\newcommand\SD{1 mm}        % <-------   
\forestset{
    LT/.style = {% Linguistic tree
        % nodes with empty content become bare coordinates (no label)
        delay={where content={}{shape=coordinate}{}},
        % put all leaves on one tier so that they line up at the bottom
        where n children=0{tier=word, baseline}{},
        for tree={
            text height = 2ex,
            text depth  = 0.5ex,
            draw,   % makes the distances more visible; remove in a real document
            inner ysep  = 0pt,
            inner xsep  = 1pt,
            s sep = \SD + 1mm, % <-------
        }}}
    
\begin{forest}  LT,
[discom
    [
        [d]
            [
                [i]
                [s]
            ]
    ]
    [
        [c]
            [
                [o]
                [m]
            ]
    ]
]
\end{forest}
\hskip \SD                  % <------- 
\begin{forest}  LT
[bobulate
    [
        [b]
            [
                [o]
                [b]
            ]
    ]
    [
        [
            [
                [u]
                [l]
            ]
        ]
        [
            [
                [a]
                [t]
            ]
            [e]
        ]
    ]
]
\end{forest}
    \end{figure}
\end{document}

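For better visibility of the distances, the nodes in the MWE above are drawn with borders (the draw option); remove it in a real document.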
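
The question also asks for nodes with more than two constituents and for some control over vertical compactness. The following is a minimal sketch of how that might look, assuming the same LT style with forked edges as above. The ternary merge of "d", "i", "s" and the l sep value are illustrative choices, not part of the original answer. An intermediate node is left unnamed simply by leaving its bracket content empty.

```latex
\documentclass[margin=3mm, varwidth]{standalone}
\usepackage[edges]{forest}

\begin{document}
% Same LT style as in the answer, plus an (assumed) l sep setting.
\forestset{
    LT/.style = {
        % nodes with empty content become bare coordinates (no label)
        delay={where content={}{shape=coordinate}{}},
        % all leaves share one tier, so they line up at the bottom
        where n children=0{tier=word, baseline}{},
        for tree={
            text height = 2ex,
            text depth  = 0.5ex,
            inner ysep  = 0pt,
            inner xsep  = 1pt,
            forked edge,
            s sep = 1mm,    % minimum horizontal gap between siblings
            l sep = 4mm,    % minimum vertical gap between parent and children
        }}}

% First child of the root: "d", "i", "s" merged in a single step.
% Second child of the root: "o" and "m" merged first, then joined with "c".
\begin{forest} LT
[discom
    [
        [d] [i] [s]
    ]
    [
        [c]
        [
            [o] [m]
        ]
    ]
]
\end{forest}
\end{document}
```

To show an intermediate name, write it inside the corresponding bracket, for example [dis [d] [i] [s]].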

Related Solutions

[Tex/LaTex] Using tikz overlay / remember picture option with forest trees

As @cfr suggested in the comment section, this can be done by making use of the tikzmark library.

You need to add \usetikzlibrary{tikzmark} to your preamble. Then, you can put a \subnode in each forest environment and then add a tikzpicture after that with the options overlay and remember picture.

When you issue the \draw command, you can prefix the names of the subnodes with pic cs:, if you want to treat that subnode as a coordinate point (otherwise don't use the pic cs: prefix). Note that you will also need to compile the document twice.

I've also moved the starting point and ending point of the arrow up and over a bit by using the ($()+()$) syntax for specifying a coordinate position.

\documentclass{beamer}
\usepackage{tikz}
\usetikzlibrary{tikzmark}
\newcommand{\false}{\mathbf{f}}
\usetikzlibrary{calc, positioning}
\usetikzlibrary{arrows}
\usepackage{forest}
\setbeamercovered{transparent}
\begin{document}

\begin{frame}{}  

% Tree 1
\begin{forest}
    [$\sigma$, calign=last
        [$a$]
        [\subnode{replaceNode}{$\langle\sigma, \{1,2\},\false \rangle$}, calign=first, baseline, draw=blue, ellipse
            [$\sigma_*$]
            [$a$]
        ]
    ]
  \node at (current bounding box.east)
        [anchor=west]
        {$\Rightarrow_G$};
\end{forest}
%
% Tree 2
\begin{forest}
    [$\sigma$, calign=last
        [$a$]
        [$\sigma$, calign=last
            [$b$]
            [\subnode{t1}{$\langle\sigma, \{1,2\},\false \rangle$}, calign=first, baseline
                [$\sigma$, calign=first
                    [$\sigma_*$]
                    [$a$]
                ]
                [$b$]
            ]
        ]
    ]
\end{forest}
%
% This arrow should connect two nodes of both trees.
\begin{tikzpicture}[overlay,remember picture]
\draw[->, black, dashed, bend angle=45, bend left] ($(pic cs:replaceNode)+(0.25,.2)$) to ($(pic cs:t1)+(0.25,.2)$);
\end{tikzpicture}

\end{frame}

\end{document}


[Tex/LaTex] Drawing binary trees with LaTeX labels

Recent releases of PGF (3.0 and later) include a number of graph drawing algorithms (requiring LuaLaTeX), among them a version of the Reingold–Tilford method, and can easily handle large numbers of nodes.

In the simplest case a tree can be specified like this:

\documentclass[tikz,border=5]{standalone}
\usetikzlibrary{graphs,graphdrawing,arrows.meta}
\usegdlibrary{trees}
\begin{document}
\begin{tikzpicture}[>=Stealth]
\graph[binary tree layout]{
  a -> {   
    b -> { 
      c -> { 
        d -> { e, f }, 
        g 
      }, 
    h -> { i, j }
    },
    k -> {
      l -> {
        m -> { n, o },
        p -> { q, r }
      }, 
      s -> {
        v -> {w, x},
        y -> {z}
      }
    }
  }
};
\end{tikzpicture}
\end{document}


It is also possible to create "graph macros", which means that the graph specification can be created more or less automatically, even using Lua:

\documentclass[tikz,border=5]{standalone}
\usetikzlibrary{graphs,graphdrawing,graphs.standard,arrows.meta}
\usegdlibrary{trees}
\begin{document}
\tikzgraphsset{%
  levels/.store in=\tikzgraphlevel,
  levels=1,
  declare={full_binary_tree}{[
    /utils/exec={
      \edef\treenodes{%
\directlua{%
  function treenodes(l)
    if l == 0 then
      return "/"
    else
      return "/ [layer distance=" .. l*10 .. "]-- {" .. treenodes(l-1) .. ", " .. treenodes(l-1) .. "}"
    end
  end
  tex.print(treenodes(\tikzgraphlevel) .. ";")
}%
      }
    },
    parse/.expand once=\treenodes 
  ]}
}
\begin{tikzpicture}
\graph[binary tree layout, grow=down, sibling distance=5pt, significant sep=0pt, nodes={fill=red, draw=none, circle, inner sep=2.5pt, outer sep=0pt}]{
   full_binary_tree [levels=7];
};
\end{tikzpicture}
\end{document}

