[TeX/LaTeX] Upside-down syntax trees for linguistics with horizontal lines

forest, linguistics, tikz-pgf, tikz-trees, trees

I know that syntactic analysis can be typeset in LaTeX using TikZ, but the tree structure it produces isn't really used in my country. Here, we mostly use straight horizontal lines instead of hierarchical trees.

Here's an example of what I want to achieve. I don't know where to start.

Is there a (more or less) easy way to do it? Any ideas?

[Image: example of the desired diagram]


Edit:

Even though all the solutions provided so far are correct, it would be nice to change the following characteristics of the diagrams you've proposed:

  • The sentence being analysed ("La novela que me ha regalado mi hermana…") should be on top of the graphic and all words must be on the same line; "la" can't be immediately over "Det" and "está" shouldn't be immediately over "N/V".

  • All the syntactic functions of the different words ("OP", "S/SN", "PN/SV", "Det"…) should be centered under the lines above them (or at least near the center, without having to adjust the spacing manually).

  • It should be possible to modify the height of the diagram.

Best Answer

Here’s a solution that uses the excellent forest package.

\documentclass[tikz,border=5pt]{standalone}
\usepackage{forest}

% Node shape adapted from http://www.texample.net/tikz/examples/data-flow-diagram/
\makeatletter \pgfdeclareshape{myunderline}{
  \inheritsavedanchors[from=rectangle]
  \inheritanchorborder[from=rectangle]
  \foreach \from in
    {center,base,north,north east,east,south east,south,south west,west,north west}{
      \inheritanchor[from=rectangle]{\from}
  }
  \backgroundpath{
    \southwest \pgf@xa=\pgf@x \pgf@ya=\pgf@y
    \northeast \pgf@xb=\pgf@x \pgf@yb=\pgf@y
    % This can be improved by removing magic numbers
    \pgfpathmoveto{\pgfpoint{\pgf@xa}{\pgf@ya+1.75em}}
    \pgfpathlineto{\pgfpoint{\pgf@xb}{\pgf@ya+1.75em}}
 }
} \makeatother

\begin{document}
\begin{forest}
for tree={
    fit=band, % Isolates space above this node from siblings’ descendants
    no edge,
    % Uncomment the line below for the dotted edges
    % edge={dotted, semithick, gray!50, shorten <=8pt}, parent anchor=north,
    % This can be improved by reducing space between levels where edges are drawn
    inner sep=0pt, outer sep=0pt,
    l sep=0pt, s sep=6pt, text depth=0.5em, grow'=north,
    where level=0{} % No style for dummy root node
      {where n children=0
        {font=\bfseries,tier=word} % Leaves in bold on the same tier
        {font=\small,tikz={\node[draw, thick, myunderline, fit to=tree] {};}} % Non-leaves
      }
}
% This can be improved by removing the need for a parent and sibling of the actual root
[,phantom[,phantom][OP
  [S/SN
    [Det [La] ]
    [N/Sust [novela] ]
    [CN/SAdj/Prop Sub Adj
      [PV/SV,
        [\textit{nexo} [que] ]
        [CI/SN [me] ]
        [N/V [ha regalado] ]
      ]
      [S/SN
        [Det [mi] ]
        [N [hermana] ]
      ]
    ]
  ]
  [PN/SV
    [N/V [est\'a] ]
    [Attrib/SAdj
      [N [ambientada] ]
    ]
    [CCL/SPrep
      [E [en] ]
      [T/SN
        [N [Australia\rlap.] ]
      ]
    ]
  ]
]]
\end{forest}
\end{document}

And here’s another version with faint dotted edges:

You can render the same structure in a more conventional appearance just by changing options:

for tree={
    edge={dotted, semithick, gray!80, shorten <=1pt,shorten >=3pt},
    parent anchor=south, child anchor=north,
    inner sep=0pt, outer ysep=2pt,
    text depth=0.5em,
    where n children=0{font=\bfseries,tier=word}{font=\small}
}
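
To try these options in isolation, here is a minimal sketch of a complete document. It assumes the block above replaces the whole "for tree" setting of the first example, and it uses only a fragment of the sentence (the subject "mi hermana") to stay short. The myunderline shape is left out because these options do not reference it.

```latex
\documentclass[tikz,border=5pt]{standalone}
\usepackage{forest}

\begin{document}
\begin{forest}
for tree={
    % Faint dotted edges from parent (south side) to child (north side)
    edge={dotted, semithick, gray!80, shorten <=1pt, shorten >=3pt},
    parent anchor=south, child anchor=north,
    inner sep=0pt, outer ysep=2pt,
    text depth=0.5em,
    % Leaves (the words) in bold on a shared tier; labels in a smaller font
    where n children=0{font=\bfseries,tier=word}{font=\small}
}
% Fragment of the example sentence, chosen only for illustration
[S/SN
  [Det [mi] ]
  [N [hermana] ]
]
\end{forest}
\end{document}
```

Swapping in the full bracket structure from the first example should give the complete sentence in the same conventional layout.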

So you can see why one might prefer using forest instead of bussproofs or semantics. Also, the forest tree syntax is much simpler, and is not “backwards” as seen in cfr’s answer.

Take a look at the forest manual for more style options.

2019 edit: the syntax of the fit to tree option has changed to fit to=tree; the code above already uses the new form.

2020 edit: setting l (for example, l=1.5em) changes the vertical spacing between levels.
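
As a rough illustration of that note, the sketch below sets l in a small upward-growing tree. The value 1.5em comes from the note itself; the reduced set of options and the sentence fragment are assumptions chosen to keep the example short, not the full style from the answer. As far as I understand the manual, l sep still enforces a minimum gap, so forest may place levels further apart than l alone asks for.

```latex
\documentclass[tikz,border=5pt]{standalone}
\usepackage{forest}

\begin{document}
\begin{forest}
for tree={
    grow'=north,      % root at the bottom, words on top
    l sep=0pt,        % no extra clearance beyond what l requests
    l=1.5em,          % requested distance between a node and its parent
    text depth=0.5em,
    where n children=0{font=\bfseries,tier=word}{font=\small},
}
% Fragment of the example sentence, chosen only for illustration
[PN/SV
  [N/V [est\'a] ]
  [Attrib/SAdj
    [N [ambientada] ]
  ]
]
\end{forest}
\end{document}
```

Trying a larger value such as l=3em should stretch the diagram vertically, which is one way to address the request to control its height.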
