[Tex/LaTex] An explanation of LaTeX’s output routine

Tags: documentation, latex-base, output-routine

Is there a good reference for LaTeX's output routine? The documented source is confused and confusing. The authors seem confused as to why parts are the way they are and wonder if maybe things should be changed:

Not sure about these: two questions. Should things which must apply to a whole
doument be local or global (they probably should be ‘preamble only’ commands)?
Are these three such things?

My favorite quotes so far are the following gems.

This is a very much an emergency action, just dumping everything: footnotes first
then floats. A more sophisticated version is needed; but even more urgent is a
bug-free version (see, for example, pr/3528).

and

We empty any left over kludge insert box here; this is a temporary fix. It should
perhaps be applied to one page of cleared floats, but who cares? The whole of this
stuff needs completely redoing for many such reasons.

I suspect that tex.stackexchange is not the right place for explaining what the entire output routine is doing, but I'd appreciate any pointers to a clear explanation. I'm especially interested in why the float mechanism invokes the output routine (with large negative penalties), sometimes multiple times; how pages of floats are processed; what these kludge insert boxes are; and what hooks class authors can use.

Best Answer

The output routine is called either by TeX's normal page-breaking mechanism or by a macro that puts a penalty of -10000 or less on the main vertical list. Penalties below -10000 are how macros communicate with the OTR. For example, a penalty of -10001 requests a \clearpage, whereas -10004 is used for a float insertion, and so on.
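The dispatch idea can be sketched in plain TeX. This is only a minimal illustration, not LaTeX's actual code: the choice of -10001 as the "special" value and the message text are assumptions made for the example. The routine inspects \outputpenalty, handles the special request by returning the material to the page, and otherwise falls back to \plainoutput.

```tex
% Plain TeX sketch: compile with `tex` or `pdftex`.
\output={%
  \ifnum\outputpenalty=-10001
    % Special request, not a real page break (value chosen for this example).
    \message{[OTR called with penalty \the\outputpenalty]}%
    \unvbox255 % return the collected material to the current page
  \else
    \plainoutput % ordinary page break: ship the page out
  \fi}

First paragraph.\par
\penalty-10001 % ask the output routine for the "special" action
Second paragraph, still on the same page.\par
\bye
```

Testing for one exact value, rather than for everything below -10000, keeps the \supereject issued by \bye (penalty -20000) on the ordinary path, so the final page is still shipped out.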

Information on the LaTeX output routine is very hard to find and, judging from the comments in LaTeX's source, it is hard to follow even for the LaTeX team!

The output routine is one of the more mysterious pieces of TeX. The chapter of the TeXbook discussing output routines claims that designing output routines makes one achieve the level of a TeX Grandmaster.

As is so often the case, mastery of the concept of an output routine in plain TeX will only barely prepare you for the complexities awaiting you with LaTeX's variant of an output routine. However, it is better to start by studying TeX's OTR first. Luckily, there is some help in a series of articles by that great TeX exegete David Salomon. They are all available online as TUGboat articles.

Output Routines: Examples and Techniques. Part I: Introduction and Examples

Output Routines: Examples and Techniques. Part II: OTR Techniques

Output Routines: Examples and Techniques. Part III: Insertions

Output Routines: Examples and Techniques. Part IV: Horizontal Techniques

Read the last part first!

For LaTeX, you can read Frank Mittelbach's published paper xo-pfloat.pdf, in which he explains some of the problems the team faces when dealing with the output routine. Reading it, you will appreciate that float placement is still one of the hard computer science problems, and you may feel a bit of sympathy for Microsoft trying to do it interactively for multi-page documents!

There is also an article by David Kastrup, Output Routine Requirements for Advanced Typesetting Tasks (Proceedings of EuroTeX 2003), outlining some of the difficult areas and specifications for generic routines.

This should give you enough background to start deciphering source2e itself. It is not all that hard, but one needs a good grounding in the standard building blocks, such as insertion lists, here points, etc.

In a nutshell, all floats are put in boxes, the boxes are kept in lists, and the algorithm unboxes them. The lists are sometimes mind-boggling, such as this one:

\gdef\@freelist{\@elt\bx@A\@elt\bx@B\@elt\bx@C\@elt\bx@D\@elt\bx@E
                 \@elt\bx@F\@elt\bx@G\@elt\bx@H\@elt\bx@I\@elt\bx@J
                 \@elt\bx@K\@elt\bx@L\@elt\bx@M\@elt\bx@N
                 \@elt\bx@O\@elt\bx@P\@elt\bx@Q\@elt\bx@R}

These boxes are sometimes not enough, and great insight can be obtained by reading the documentation of related packages such as morefloats, which simply adds more boxes to the above list using hundreds of \expandafters!
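As a rough sketch of what "adding a box to the list" means (this is not morefloats' actual code), one more insert can be allocated and appended to \@freelist. The name \bx@extraA is hypothetical and chosen so that it does not clash with any box the kernel already defines. \g@addto@macro is the kernel's helper for appending to a macro globally.

```latex
\documentclass{article}
\makeatletter
% \bx@extraA is a made-up name for this illustration only.
\newinsert\bx@extraA
% Append one more \@elt entry to the list of free float boxes.
\g@addto@macro\@freelist{\@elt\bx@extraA}
\makeatother
\begin{document}
Text with many floats \ldots
\end{document}
```

Recent LaTeX kernels also offer \extrafloats{<number>} for this purpose, which is the safer route when it is available.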

There are some useful macros in the code. One is the technique of using \@elt lists (they are just the equivalent of Knuth's double-backslash \\ lists; \@elt is a Lisp relic, an abbreviation for "element"). Also look up \@bitor, \@xbitor, etc.
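The \@elt idiom is easiest to see on a small list. The following is a minimal sketch with made-up names (\my@list and \showlist are not kernel macros): the list stores its items behind \@elt, and each pass over the list gives \@elt whatever meaning is needed for that pass.

```latex
\documentclass{article}
\makeatletter
% A hypothetical list in the same shape as \@freelist.
\def\my@list{\@elt{alpha}\@elt{beta}\@elt{gamma}}
% Walk the list: locally define \@elt, then simply execute the list.
\newcommand\showlist{%
  \begingroup
    \def\@elt##1{[##1]}%
    \my@list
  \endgroup}
\makeatother
\begin{document}
\showlist
\end{document}
```

Because the redefinition of \@elt is confined to a group, the same list can later be traversed with a different action, which is how the float lists get reused for several tasks.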

Looking for hooks? Perhaps you can use \AtBeginDvi, in much the same way that the bophook package uses it to add watermarks to a page.

Lastly, just to touch on the kludgeins. Depending on one's interpretation, a kludge is either "an ill-assorted collection of poorly-matching parts, forming a distressing whole" or, following the German klug, something witty or smart. I go for the latter! My favourite quote from the source:
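For comparison, here is a hedged sketch of a watermark using the shipout hooks of newer kernels. It assumes a LaTeX release from 2020-10 or later, where the shipout/background hook exists and runs inside a picture with its origin at the top-left corner of the page. The text "DRAFT" and the coordinates are arbitrary choices for the example.

```latex
\documentclass{article}
% Assumes LaTeX 2020-10-01 or newer (hook management available).
\AddToHook{shipout/background}{%
  % Origin is the top-left corner, so the y coordinate is negative.
  \put(1in,-1in){\makebox(0,0)[lt]{\textsf{DRAFT}}}}
\begin{document}
Some text.\newpage More text.
\end{document}
```

On older installations this hook is not available, and packages built on \AtBeginDvi remain the way to go.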

The star form of this command is dedicated to Leslie Lamport, the other we need for ourselves (FMi, CAR).

Great Team with a good sense of humour! Can't wait to hear from the other members here of the LaTeX3 way!

Related Solutions

[Tex/LaTex] Unbalanced output routine

When one says

\toks0={\plainoutput}\showthe\toks0

TeX answers

> \plainoutput .

If one says after this

\output=\toks0

then \showthe\output gives

> \plainoutput .

If one then says \hbox{}\penalty-10000, then the error message

! Missing { inserted.
<to be read again> 
                   \shipout 
\plainoutput ->\shipout 
                        \vbox {\makeheadline \pagebody \makefootline }\advan...
<*> \hbox{}\penalty-10000

is issued. Now TeX is in internal vertical mode, as \end produces

! You can't use `\end' in internal vertical mode.

but typing } doesn't do any good: we fall into a black hole where TeX no longer interprets any tokens.

It's interesting that TeX adds braces when the assignment to \output is of the form

\output=<general text>

but not when it's like \output=<token register>.

The relevant code for the addition of braces is in module 1226, where Knuth comments "For safety’s sake, we place an enclosing pair of braces around an \output list.", and module 1227.
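The contrast between the two forms of assignment can be reproduced with a small plain TeX file. This is only a sketch for inspection: it never triggers a page break, because running the unbraced routine leads to the error described above.

```tex
% Plain TeX sketch: run interactively and press <return> at each \showthe.
\output={\plainoutput}  % <general text>: TeX wraps the list in braces
\showthe\output

\toks0={\plainoutput}
\output=\toks0          % token register: copied as is, no braces added
\showthe\output
\bye
```

The first \showthe should display the token list enclosed in braces, while the second shows the bare \plainoutput reported earlier.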

An interesting module to examine is 1100, which ends with output_group: followed by what is in module 1026.

I realize this is not a full answer; it only shows that it's better not to monkey with the closing brace. TeX is in a very particular type of group when it performs the output routine, and disturbing it during this task proves to be quite dangerous.

I've also found a discussion on comp.text.tex that seems to have points in common with this problem.

[Tex/LaTex] Background baseline grid with output routine

Some corrections are necessary (you're forgetting \topskip):

\newbox\gridbox
\setbox\gridbox\line{%
  \special{color push rgb .8 .8 1}%
  \vrule height\baselineskip width0pt \hrulefill
  \special{color pop}}
\def\grid{\vtop to0pt{\hrule height0pt\kern-\dimexpr\baselineskip-\topskip\relax
    \vbox to\dimexpr\vsize+2pt\relax{\leaders\copy\gridbox\vfil}\vss}}
\def\pagebody{\vbox to\vsize{\boxmaxdepth=\maxdepth \grid\pagecontents}}

\parskip=0pt \vsize=\dimexpr\topskip+44\baselineskip\relax % 45 lines per page

\def\1{As any dedicated reader can clearly see, the Ideal of
practical reason is a representation of, as far as I know, the things
in themselves; as I have shown elsewhere, the phenomena should only be
used as a canon for our understanding. The paralogisms of practical
reason are what first give rise to the architectonic of practical
reason. As will easily be shown in the next section, reason would
thereby be made to contradict, in view of these considerations, the
Ideal of practical reason, yet the manifold depends on the phenomena.
Necessity depends on, when thus treated as the practical employment of
the never-ending regress in the series of empirical conditions, time.
Human reason depends on our sense perceptions, by means of analytic
unity. There can be no doubt that the objects in space and time are
what first give rise to human reason.

Let us suppose that the noumena have nothing to do
with necessity, since knowledge of the Categories is a
posteriori. Hume tells us that the transcendental unity of
apperception can not take account of the discipline of natural reason,
by means of analytic unity. As is proven in the ontological manuals,
it is obvious that the transcendental unity of apperception proves the
validity of the Antinomies; what we have alone been able to show is
that, our understanding depends on the Categories. It remains a
mystery why the Ideal stands in need of reason. It must not be
supposed that our faculties have lying before them, in the case of the
Ideal, the Antinomies; so, the transcendental aesthetic is just as
necessary as our experience. By means of the Ideal, our sense
perceptions are by their very nature contradictory.

}

\1\1\1\1\1\1\1\1\1

\bye
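The page height set in the code above can be checked by hand. Assuming plain TeX's default values of \topskip=10pt and \baselineskip=12pt (neither is changed in the example), the first baseline sits \topskip below the top of the page and the remaining 44 baselines follow at \baselineskip intervals:

```latex
\[
  \mathit{vsize} = \mathit{topskip} + 44 \times \mathit{baselineskip}
                 = 10\,\mathrm{pt} + 44 \times 12\,\mathrm{pt}
                 = 538\,\mathrm{pt}
\]
```

This is why the grid has to be shifted by \baselineskip-\topskip: the first rule must line up with the first baseline, not with the top edge of the page.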

EDIT

A different definition of \grid that better shows what happens:

\def\grid{%
  \vtop to 0pt{
    \hrule height 0pt % set the reference point
    \special{color push rgb .8 .8 1}
    \vbox to \vsize{
      \vskip\topskip\vskip-0.4pt % backup because of the first rule
      \hrule
      \cleaders\vbox to\baselineskip{\vfill\hrule width \hsize}\vfill
    }
    \special{color pop}
    \vss % we want to have zero depth
  }%
}
