[TeX/LaTeX] Is it still worthwhile to let TeX try line-breaking without hyphenation?

hyphenation, line-breaking, profiling, tex-core

Here's TeX's line-breaking approach (as I understand it) in a nutshell:

  1. If \pretolerance is non-negative, try to break a paragraph into lines without inserting discretionary hyphens and without exceeding a badness of \pretolerance on any line.

  2. If method 1 fails, allow hyphenation and try not to exceed a badness of \tolerance.

  3. If method 2 fails and \emergencystretch is positive, try again with the amount of "tolerable" white space per text line increased by \emergencystretch.
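As a rough sketch, the three passes map onto the following parameters. The values shown are the usual plain TeX and LaTeX defaults (\emergencystretch stays at zero unless something like \sloppy sets it), so treat them as illustrative rather than as a recommendation:

```latex
\pretolerance=100      % pass 1: no hyphenation, each line's badness must stay <= 100
\tolerance=200         % pass 2: hyphenation allowed, each line's badness must stay <= 200
\emergencystretch=0pt  % pass 3 is attempted only if this is positive

% To skip pass 1 and start directly with the hyphenating pass:
% \pretolerance=-1
```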

On p. 96 of the TeXbook, Knuth reports experiments showing that "the first pass [without hyphenation] succeeds more than 90% of the time" for "fairly wide" lines, but fails quickly for "very narrow" ones. He also states that the first pass is done "[i]n order to save time". My interpretation of this is as follows:

  • For cases where line-breaking without hyphenation fails, one would in fact save time by omitting the first pass (i.e., setting \pretolerance=-1).

  • It is also possible (though maybe not very likely for languages with a small average word length) that the first pass will succeed, but that allowing hyphenation would have resulted in a solution with smaller badness.

  • As Knuth nevertheless chose a default value of 100 for \pretolerance, he must have regarded the net time savings from "trying without hyphenation" as worthwhile, given the average processing power of the time when he adopted these settings.
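To see which pass actually succeeds for a given paragraph, TeX's paragraph tracing can help. The following is a minimal sketch, assuming a standard LaTeX run; the sample sentence is only a placeholder. With tracing on, the log marks each attempt with @firstpass, @secondpass or @emergencypass:

```latex
\documentclass{article}
\begin{document}
% Write the line-breaking trace to the log file and to the terminal.
\tracingparagraphs=1
\tracingonline=1
In olden times when wishing still helped one, there lived a king whose
daughters were all beautiful, but the youngest was so beautiful that the
sun itself, which has seen so much, was astonished whenever it shone in
her face.\par
% Switch tracing off again for the rest of the document.
\tracingparagraphs=0
\end{document}
```

If the trace for a paragraph contains only @firstpass, the paragraph was broken without any hyphenation attempt.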

I don't know if the default settings for TeX's line-breaking algorithm changed over time. But isn't it possible that formerly substantial net time savings are irrelevant today?

So: Is it still worthwhile to let TeX try line-breaking without hyphenation? Or is it preferable by now to adopt \pretolerance=-1 as default setting?
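One way to check this empirically is to time repeated line-breaking of the same paragraph with both settings. The sketch below assumes pdfLaTeX, because it relies on the pdfTeX primitives \pdfresettimer and \pdfelapsedtime; the names \timedrun, \sampletext and \reps, the column width and the repetition count are made up for illustration:

```latex
\documentclass{article}
\newcount\reps
\newcommand\sampletext{In olden times when wishing still helped one,
  there lived a king whose daughters were all beautiful, but the youngest
  was so beautiful that the sun itself, which has seen so much, was
  astonished whenever it shone in her face.}
% #1 = value of \pretolerance to test
\newcommand\timedrun[1]{%
  \pdfresettimer
  \reps=0
  \loop\ifnum\reps<1000
    % Break the paragraph inside a box that is then discarded.
    % \hbadness and \hfuzz are raised so that warnings do not distort the timing.
    \setbox0=\vbox{\hsize=4.5cm \hbadness=10000 \hfuzz=\maxdimen
      \pretolerance=#1 \sampletext\par}%
    \advance\reps by 1
  \repeat
  % \pdfelapsedtime counts in scaled seconds (65536 per second).
  \typeout{pretolerance=#1: \the\pdfelapsedtime\space scaled seconds}%
}
\begin{document}
\timedrun{100}
\timedrun{-1}
\end{document}
```

The two \typeout lines in the log can then be compared directly. The document produces no pages, since every paragraph is set in a box that is thrown away.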

Best Answer

It has no serious impact on performance on modern machines, and I can vouch that the same holds on old machines. Depending on your settings, more than 50% of the text would normally get through the first pass. Here is a figure of two tests (the red numbers denote badness):

[Figure: the two test paragraphs, with badness shown in red]

The tests were carried out using code posted by Wilson on Git. Personally, I would recommend leaving \pretolerance at 100; it will probably be faster, since in the majority of cases you do not force the later passes.

\documentclass{article}
\usepackage{xcolor}
%%% Code from GIT posted by Wilson

\frenchspacing
\fussy

\makeatletter
\newbox\trialbox
\newbox\linebox
\newcount\maxbad
\newcount\linebad
\newcount\bestbad
\newcount\worstbad
\newcount\overfulls
\newcount\currenthbadness

\def\trypar#1\par{%
  \showtrybox{\linewidth}{#1\par}%
}

\newcommand\showtrybox[2]{%
  \currenthbadness=\hbadness
  \maxbad=0\relax
  \setbox\trialbox=\vbox{%
    \hsize#1\relax#2%
    \hbadness=10000000\relax
    \eatlines
  }%
  \hbadness=10000000\relax
  \setbox\trialbox=\vbox{%
    \hsize#1\relax#2%
    \printlines
  }%
  \noindent\usebox\trialbox\par
  \hbadness=\currenthbadness
}

\newcommand\trybox[2]{%
  \currenthbadness=\hbadness
  \maxbad=0\relax
  \setbox\trialbox=\vbox{%
    \hsize#1\relax#2\par
    \hbadness=10000000\relax
    \eatlines
  }%
  \hbadness=\currenthbadness
}

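\def\eatlines{%
  \begingroup
  % Take the last line off the current vertical list.
  \setbox\linebox=\lastbox
  % Re-set a copy to \hsize; \badness then holds the badness of that box.
  \setbox0=\hbox to \hsize{\unhcopy\linebox\hss}%
  \linebad=\the\badness\relax
  % Remember the worst badness seen so far.
  \ifnum\linebad>\maxbad\relax \global\maxbad=\linebad\relax \fi
  % Stop when no box is left; otherwise drop the glue and penalty and recurse.
  \ifvoid\linebox\else
    \unskip\unpenalty\eatlines
  \fi
  \endgroup
}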

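\def\printlines{%
  \begingroup
  % Take the last line off the current vertical list and measure its badness.
  \setbox\linebox=\lastbox
  \setbox0=\hbox to \hsize{\unhcopy\linebox}%
  \linebad=\the\badness\relax
  \ifvoid\linebox\else
    % Recurse first, so that the lines are printed in their original order.
    \unskip\unpenalty\printlines
    % Print this line, followed by its badness marker.
    \ifhmode\newline\fi\noindent\box\linebox\showbadness
  \fi
  \endgroup
}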

\def\showbadness{%
  \makebox[0pt][l]{%
    \ifnum\currenthbadness<\linebad\relax
      \ifnum\linebad=1000000\relax\expandafter\@gobble\fi
      {\quad\color{red}\rule{\overfullrule}{\overfullrule}~{\footnotesize\sffamily(\the\linebad)}}%
    \fi
  }%
}

\makeatother

\begin{document}

\hbadness=-1 
\begin{minipage}[t]{4.5cm}
\trypar\hyphenpenalty=500\looseness=1
In olden times when wishing
still helped one, there lived a
king whose daughters were all
beautiful, but the youngest was so
beautiful that the sun itself,
which has seen so much, was
astonished whenever it shone in
her face. Close by the king's
castle lay a great dark forest,
and under an old lime-tree in the
forest was a well, and when
the day was very warm, the
king's child went out into the 
forest and sat down by the side
of the cool fountain, and when she was bored she
took a golden ball, and threw it up on a high and caught it, and this
ball was her favorite plaything. \par
\end{minipage}
\hspace{2cm}
\begin{minipage}[t]{4.5cm}
\trypar\hyphenpenalty=10000\looseness=1
In olden times when wishing
still helped one, there lived a
king whose daughters were all
beautiful, but the youngest was so
beautiful that the sun itself,
which has seen so much, was
astonished whenever it shone in
her face. Close by the king's
castle lay a great dark forest,
and under an old lime-tree in the
forest was a well, and when
the day was very warm, the
king's child went out into the 
forest and sat down by the side
of the cool fountain, and when she was bored she
took a golden ball, and threw it up on a high and caught it, and this
ball was her favorite plaything. \par
\end{minipage}
\end{document}