[TeX/LaTeX] An explanation of LaTeX's output routine

documentation latex-base output-routine

Is there a good reference for LaTeX's output routine? The documented source is confused and confusing. The authors seem confused as to why parts are the way they are and wonder if maybe things should be changed:

Not sure about these: two questions. Should things which must apply to a whole
doument be local or global (they probably should be ‘preamble only’ commands)?
Are these three such things?

My favorite quotes so far are the following gems.

This is a very much an emergency action, just dumping everything: footnotes first
then floats. A more sophisticated version is needed; but even more urgent is a
bug-free version (see, for example, pr/3528).

and

We empty any left over kludge insert box here; this is a temporary fix. It should
perhaps be applied to one page of cleared floats, but who cares? The whole of this
stuff needs completely redoing for many such reasons.

I suspect that tex.stackexchange is not the right place for explaining what the entire output routine is doing, but I'd appreciate any pointers to a clear explanation. I'm especially interested in why the float mechanism invokes the output routine (with large negative penalties), sometimes multiple times; how pages of floats are processed; what these kludge insert boxes are; and what hooks class authors can use.

Best Answer

The output routine is called either by TeX's normal page-breaking mechanism or by a macro that puts a penalty less than or equal to -10000 on the main vertical list. These large negative penalties are how macros communicate with the output routine (OTR). For example, a penalty of -10001 signals a \clearpage, whereas -10004 signals a float insertion, and so on.
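The penalty-signalling idea can be tried in plain TeX. The following is only a minimal sketch, not LaTeX's actual output routine: the counter \requests, the helper \signal and the choice to react only to -10004 are assumptions made for illustration. The output routine inspects \outputpenalty; for the special code it puts the collected material back instead of shipping a page, and for any other break it ships the page as usual.

```latex
% Plain TeX sketch (process with `tex`); not LaTeX's actual output routine.
\newcount\requests               % hypothetical counter of special requests
\output={%
  \ifnum\outputpenalty=-10004    % hypothetical "special request" code
    \global\advance\requests by 1
    \message{[request \the\requests: penalty \the\outputpenalty]}%
    \unvbox255                   % no page is shipped; return the material
  \else
    \shipout\box255              % an ordinary page break: ship the page
    \global\advance\pageno by 1
  \fi}
% Hypothetical helper: call the output routine with a given code.
\def\signal#1{\par\penalty-#1\relax}
First paragraph.
\signal{10004}
Second paragraph.
\bye
```

Both paragraphs end up on the same page: the first call to the output routine only records the request, and the page is shipped when \bye forces the final break.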

Information on LaTeX's output routine is very hard to find and, judging from the comments in LaTeX's source, it is hard to follow even for the LaTeX team!

The output routine is one of the more mysterious pieces of TeX. The chapter of the TeXbook discussing output routines claims that designing output routines makes one achieve the level of a TeX Grandmaster.

As is so often the case, mastering the concept of an output routine in plain TeX will only barely prepare you for the complexities awaiting you in LaTeX's variant. Still, it is better to start by studying TeX's OTR first. Luckily, there is some help in a series of articles by that great TeX exegete David Salomon. They are all available online as TUGboat articles.

Output Routines: Examples and Techniques. Part I: Introduction and Examples

Output Routines: Examples and Techniques. Part II: OTR Techniques

Output Routines: Examples and Techniques. Part III: Insertions

Output Routines: Examples and Techniques. Part IV: Horizontal Techniques

Read the last part first!

For LaTeX, you can read Frank Mittelbach's published paper, xo-pfloat.pdf, in which he explains some of the problems the team faces when dealing with the output routine. Reading it, you will appreciate that float placement is still one of the hard computer science problems, and feel a bit of sympathy for Microsoft trying to do it interactively for multi-page documents!

There is also an article by David Kastrup, Output Routine Requirements for Advanced Typesetting Tasks (Proceedings of EuroTeX 2003), outlining some of the difficult areas and specifications for generic routines.

This should give you enough background to start deciphering source2e itself. It is not all that hard, but one needs a good grounding in the standard building blocks, such as insertion lists, here points, etc.

In a nutshell, all floats are put into boxes, the boxes are kept in lists, and the algorithm unboxes them when they are placed. Some of these lists are mind-boggling, such as this one:

\gdef\@freelist{\@elt\bx@A\@elt\bx@B\@elt\bx@C\@elt\bx@D\@elt\bx@E
                 \@elt\bx@F\@elt\bx@G\@elt\bx@H\@elt\bx@I\@elt\bx@J
                 \@elt\bx@K\@elt\bx@L\@elt\bx@M\@elt\bx@N
                 \@elt\bx@O\@elt\bx@P\@elt\bx@Q\@elt\bx@R}

These boxes are sometimes not enough, and great insight can be gained by reading the documentation of related packages such as morefloats, which simply adds more boxes to the above list using hundreds of \expandafter tokens!
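Adding a box to the free list can be sketched in a few lines. This is a minimal sketch and not how morefloats itself is written: the register name \bx@extra is hypothetical, and the sketch relies on the kernel macro \@cons, which appends an \@elt entry to a list. As far as I know, recent kernels also provide \extrafloats{<number>} for the same purpose.

```latex
\documentclass{article}
\makeatletter
% Allocate one extra box register; the name \bx@extra is hypothetical.
\newbox\bx@extra
% \@cons appends "\@elt\bx@extra" to \@freelist without expanding
% the entries that are already there.
\@cons\@freelist\bx@extra
\makeatother
\begin{document}
The float mechanism now has one more box to work with.
\end{document}
```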

There are some useful macros in the code. One is the way of using \@elt lists (they are just the equivalent of Knuth's double-backslash list separators; \@elt is a Lisp relic and an abbreviation of "element"). Also look up \@bitor, \@xbitor, etc.
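To see how such a list is processed, here is a minimal sketch under stated assumptions: the list \my@list, the registers \my@boxA to \my@boxC and the counter \my@count are hypothetical names, not kernel macros. The trick is that the list itself does nothing until \@elt is given a meaning; defining \@elt as a one-argument macro and then expanding the list applies that macro to every entry.

```latex
\documentclass{article}
\makeatletter
% Hypothetical registers and list, in the same format as \@freelist.
\newbox\my@boxA \newbox\my@boxB \newbox\my@boxC
\def\my@list{\@elt\my@boxA\@elt\my@boxB\@elt\my@boxC}
\newcount\my@count
\begingroup
  % Give \@elt a temporary meaning, then "run" the list.
  \def\@elt#1{\global\advance\my@count\@ne
              \typeout{entry \the\my@count: box register \number#1}}%
  \my@list
\endgroup
\typeout{Total entries: \the\my@count}% writes "Total entries: 3"
\makeatother
\begin{document}
Nothing to typeset here; the results appear in the log.
\end{document}
```

Redefining \@elt inside a group keeps the change local, so other code that relies on its usual meaning is not disturbed.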

Looking for hooks? Perhaps you can use \AtBeginDvi, in a similar way to how the bophook package uses it to add watermarks to a page.
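As a rough illustration of this hook: the kernel command \AtBeginDvi stores material that is placed at the start of the first shipped-out page. A minimal sketch, assuming a DVI-based workflow whose driver understands the papersize \special:

```latex
\documentclass{article}
% The stored material is added to the first page at shipout time.
% Assumption: the DVI driver (for example dvips) honours this \special.
\AtBeginDvi{\special{papersize=\the\paperwidth,\the\paperheight}}
\begin{document}
Hello, world.
\end{document}
```

Because the material goes onto the first page only, putting something on every page needs a per-page hook.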

Lastly, just to touch on the kludgeins. Depending on one's interpretation, a kludge is either an ill-assorted collection of poorly matching parts forming a distressing whole or, in the more German sense, something witty or smart. I go for the latter! My favourite quote from the source:

The star form of this command is dedicated to Leslie Lamport, the other we need for ourselves (FMi, CAR).

A great team with a good sense of humour! I can't wait to hear from the other members here about the LaTeX3 way!
