[TeX/LaTeX] What research papers exist about TeX and friends?

big-list, latex-misc, research, tex-core

Some time ago there was a question on this site: “Are there any open research problems in the world of TeX?”

In a similar spirit, I'd like to ask whether there are any research articles in the peer-reviewed scientific literature that deal with TeX/LaTeX. This could be on different levels:

  • TeX as a typesetting engine, i.e. low-level development. There is, for example, the work by Hàn Thế Thành, who implemented pdfTeX and its microtypographical extensions: PhD Thesis (PDF).

  • TeX as a macro language. Programming TeX macros is quite different from other programming languages. I am not (yet) aware of any research in this direction.

Searching Google Scholar for “LaTeX typesetting” yields mostly introductory books about LaTeX. What are the correct keywords to look for or which journals do I have to scan?

Best Answer

Line-breaking

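The classic paper about TeX, which (IIRC) DEK once called the main research output of the TeX project, is:

  • “Breaking paragraphs into lines” (1981) by Donald E. Knuth and Michael F. Plass, in Software: Practice and Experience 11, 1119–1184.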

[Reprinted in Digital Typography (1999), with some updates to notation etc. If you get an old printing of the book make sure to read the errata; there were unfortunately some significant typos.]

IMO even if you never use TeX or LaTeX, you ought to read this paper. It's a tour de force. In 66 pages, it introduces the line-breaking problem, formulates it mathematically, defines desirability criteria (badness, etc.), compares approaches, shows the power of this model (lots of sophisticated examples), then goes into the TeX algorithm with all its “bells and whistles”, before concluding with an inspiring history.

At heart, this paper is about TeX's elegant and powerful boxes-and-glue (or box-glue-penalty) model. By feeding an appropriate sequence of boxes, glue and penalties to one and the same algorithm, we can solve many typesetting problems: not just typical text (fully justified paragraphs) but also centered text, hanging indentation, ragged-right text, program source code, book indices, quotations and author lines and blocks that should look different at different widths, paragraphs in various “shapes”, etc. After showing off these solutions, the paper develops some theory (a kind of "algebra") that helps you come up with similar constructions yourself.
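
To make the box-glue-penalty idea concrete, here is a minimal Python sketch. It is not the paper's or TeX's actual algorithm: all names (`Box`, `Glue`, `Penalty`, `badness`, `paragraph`, `break_paragraph`) are made up for illustration, widths are simply character counts, the demerits formula is simplified, badness is capped at 10000, and the search is a plain quadratic dynamic program with none of TeX's refinements (tolerance, hyphenation, looseness, and so on). It only shows how one optimizer, fed a list of boxes, glue and penalties, picks the breaks with the least total demerits.

```python
import math
from dataclasses import dataclass

@dataclass
class Box:        # unbreakable material of fixed width (here: one word)
    width: float
    text: str = ""

@dataclass
class Glue:       # flexible space: natural width plus stretch/shrink limits
    width: float
    stretch: float
    shrink: float

@dataclass
class Penalty:    # possible breakpoint; -inf forces a break, +inf forbids one
    cost: float

def badness(natural, stretch, shrink, target):
    """100*|r|^3 for adjustment ratio r, capped at 10000; inf if overfull."""
    if natural == target:
        return 0.0
    if natural < target:                      # line must stretch
        if stretch == 0:
            return 10000.0
        r = (target - natural) / stretch      # r is 0.0 when stretch is inf
    else:                                     # line must shrink
        if natural - target > shrink:
            return math.inf
        r = (target - natural) / shrink
    return min(100 * abs(r) ** 3, 10000.0)

def paragraph(text):
    """Assumed encoding: words are boxes, spaces are glue (1 wide, 1 stretch)."""
    items = []
    for word in text.split():
        if items:
            items.append(Glue(1, 1, 0))
        items.append(Box(len(word), word))
    # Finishing sequence: no break before the filler glue, then a forced break.
    items += [Penalty(math.inf), Glue(0, math.inf, 0), Penalty(-math.inf)]
    return items

def break_paragraph(items, target):
    """Return the lines (as strings) with minimal total simplified demerits."""
    legal = [i for i, it in enumerate(items)
             if (isinstance(it, Penalty) and it.cost < math.inf)
             or (isinstance(it, Glue) and i > 0 and isinstance(items[i - 1], Box))]
    breaks = [-1] + legal                     # -1 stands for the paragraph start
    best = {-1: (0.0, None)}                  # break -> (demerits, previous break)
    for j, b in enumerate(breaks[1:], start=1):
        pen = items[b].cost if isinstance(items[b], Penalty) else 0.0
        for a in breaks[:j]:
            if a not in best:
                continue
            line = items[a + 1:b]
            if any(isinstance(it, Penalty) and it.cost == -math.inf for it in line):
                continue                      # a line cannot span a forced break
            while line and not isinstance(line[0], Box):
                line = line[1:]               # discard glue/penalties after a break
            if not line:
                continue
            natural = sum(it.width for it in line if not isinstance(it, Penalty))
            stretch = sum(it.stretch for it in line if isinstance(it, Glue))
            shrink = sum(it.shrink for it in line if isinstance(it, Glue))
            bad = badness(natural, stretch, shrink, target)
            if bad == math.inf:
                continue
            total = best[a][0] + (1 + bad + max(pen, 0.0)) ** 2  # simplified demerits
            if b not in best or total < best[b][0]:
                best[b] = (total, a)
    end = len(items) - 1                      # assumes a final forced break
    if end not in best:
        raise ValueError("no feasible set of line breaks")
    lines, b = [], end
    while b != -1:                            # walk back through the chosen breaks
        a = best[b][1]
        lines.append(" ".join(it.text for it in items[a + 1:b] if isinstance(it, Box)))
        b = a
    return lines[::-1]
```

Calling `break_paragraph(paragraph(some_text), 30)` returns the chosen lines for a 30-character measure. Changing only the item list, for instance giving the inter-word glue different stretch or inserting extra penalties, changes the layout without touching the optimizer, which is the point the paper makes with its ragged-right and centered-text constructions.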

Many of these problems also appear in The TeXbook as double dangerous-bend exercises and in Appendix D: Dirty Tricks, but the paper's treatment is more expository. Most of them arose from real life, out of practical needs:

We wish to thank Barbara Beeton of the American Mathematical Society for numerous discussions about ‘real world’ applications;

An abridged version of this paper also exists:

  • “Choosing better line breaks” (1982) by Michael F. Plass and Donald E. Knuth, in Document Preparation Systems, Nievergelt et al., eds. (Amsterdam: North-Holland, 1982), 221–242.

This paper is harder to find and overlaps heavily with the previous one, apart from some new terminology (it uses boxes, glue and “kerfs” instead).