This question led to a new package: unravel.
Consider the following MWE, test.tex:
\documentclass[12pt]{article}
\begin{document}
\tracingassigns=1
\tracingmacros=1
\def\aaa{something}
\def\bbb{else \aaa, else}
\edef\ccc{third \bbb, level}
\tracingassigns=0
\tracingmacros=0
\end{document}
If you build this with pdflatex test.tex, then in the logfile, test.log, you get something like this (linebreaks added for legibility):
{into \tracingassigns=1}
{changing \tracingmacros=0}
{into \tracingmacros=1}
{changing \aaa=undefined}
{into \aaa=macro:->something}
{changing \bbb=undefined}
{into \bbb=macro:->else \aaa , else}
\bbb ->else \aaa , else
\aaa ->something
{changing \ccc=undefined}
{into \ccc=macro:->third else something, else, le\ETC.}
{changing \tracingassigns=1}
Now, this explains the expansion steps done by (La)TeX quite well for this short example. Unfortunately, it becomes extremely hard to read (for me) once you have to deal with possibly hundreds of these expansions, some of them maybe dealing with typesetting procedures.
So I was thinking: it shouldn't be too difficult to build an application that reads the logfile line by line and allows "stepping" through it. I'd imagine the right arrow key (->) would step forward through the log and the left arrow key (<-) would step backwards; possibly, one could also specify a line number of the logfile as a starting point.
Then, the application would simply react to '^{changing', '^{into', and possibly '^\\(.*)->(.*)'; it would display the line, as well as the "current" token elsewhere on screen. So at the "changing" line, the extra portion of the screen would say \aaa=undefined; and upon the "into" line, the snippet would change to \aaa=macro:->something.
I think just this facility would make visualizing and understanding the (La)TeX expansion process much easier (especially in "real" documents). And in fact, such an application doesn't even need a full-blown GUI: I'd imagine an ncurses terminal application would do just as well (problems with display of long strings in a limited-width terminal notwithstanding).
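To make the idea concrete, here is a minimal sketch in Python of the log-stepping logic described above. It assumes that each trace entry sits on its own line, as in the listing shown earlier (a real log may wrap long lines). The names parse_log and Stepper are hypothetical and chosen only for illustration, and the key handling and ncurses display are left out.

```python
import re

# Patterns adapted from the ones named in the question.
CHANGING = re.compile(r'^\{changing (.*)\}$')
INTO = re.compile(r'^\{into (.*)\}$')
EXPANSION = re.compile(r'^(\\.*?) ?->(.*)$')


def parse_log(lines):
    """Return a list of (kind, payload) events for the trace lines we recognise."""
    events = []
    for line in lines:
        line = line.rstrip("\n")
        m = CHANGING.match(line)
        if m:
            events.append(("changing", m.group(1)))
            continue
        m = INTO.match(line)
        if m:
            events.append(("into", m.group(1)))
            continue
        m = EXPANSION.match(line)
        if m:
            # Macro name (with its parameter text, if any) and replacement text.
            events.append(("expand", (m.group(1), m.group(2))))
    return events


class Stepper:
    """Move forwards and backwards through parsed events (hypothetical helper)."""

    def __init__(self, events, start=0):
        if not events:
            raise ValueError("no trace events found")
        self.events = events
        # Clamp the starting point to a valid index.
        self.pos = max(0, min(start, len(events) - 1))

    def current(self):
        return self.events[self.pos]

    def forward(self):
        self.pos = min(self.pos + 1, len(self.events) - 1)
        return self.current()

    def backward(self):
        self.pos = max(self.pos - 1, 0)
        return self.current()
```

A front end would read test.log into a list of lines, pass them to parse_log, and then bind the right and left arrow keys to forward() and backward(), showing the returned payload in the extra portion of the screen.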
So, I was wondering – is there any application similar to this out there?
Best Answer
EDIT: This answer led to (in fact, is) a package: the unravel package, on GitHub. It relies on the gtl package, so if you want to help me by testing, you'll need to grab this as well. Both packages are written using the LaTeX3 programming language provided by the expl3 package (l3kernel) and some extensions in l3experimental.

I have now implemented most of TeX's primitives. The parts missing are math mode, tables (\halign etc.), discretionaries (including \-), the output routine, \aftergroup, \letterspacefont, \pdfcopyfont, \pdfprimitive, and all XeTeX and LuaTeX goodies. Unavoidably also, category codes are fixed when files are opened the first time, \outer macros will break the package, and begin-group and end-group characters other than left and right braces will cause trouble. Despite all those restrictions, \unravel{\documentclass{article}\relax} will show you all the nitty-gritty details of what TeX does when going through article.cls (and before this, all the work that goes into deciding whether or not a file is worth reading). Beware: at full speed, this takes several minutes and 20000 steps.

As of now, the
unravel package only provides a very simple interface to monitor TeX's activities while it is going through the expansion and typesetting process. One can only go forward. Let us give an example of use. Put the following code in a file, say filename.tex, and run pdflatex filename.tex in a terminal.

After a small welcome message, \unravel will wait for your input. Either go through steps one at a time, by pressing the enter key, or type s20o1 ("*s*croll 20 steps but still *o*utput") then enter. In the latter case, the output is (similar to, depending on the version) what follows.

Of course, the effect of the \AtEndDocument command takes place, and there is a message "Bye!" at the end of the compilation.

This example did not involve any complicated expansion. For more fun in this direction, try to understand how the l3fp expression parsing works...

After some steps, I get the following shown on my terminal:
(I obtained this by pressing s3333 then enter twice). What does all this mean? Well, at some point during the expansion of \fp_eval:n { sin(2pi/3) }, TeX found \exp_after:wN (the LaTeX3 name for \expandafter), kept the next token, \__fp_to_decimal_dispatch:w, for a later expansion, and expanded what follows. What follows was \tex_romannumeral:D (next token in the part of the screen marked with ||), which made TeX look for a number. After expanding some tokens further, TeX found the (incomplete) number -`0 (equal to -48), which made it expand further, and so on. The snapshot shown on the screen is taken at a time where TeX is 24 levels deep in such nested expansions (all the tokens in the || part), and is going towards a 10th level since the last \exp_after:wN of the || part, after jumping over \__fp_fixed_mul_after:wn, hits \int_use:N (aka \the). Needless to say, macros to perform floating point computations are a bit hard to follow.

In general, there can be three parts:
|> denotes tokens on input that TeX has not yet seen, or that have been reinserted, for instance after a macro expansion;
|| denotes tokens that are stored for later;
<| denotes commands that have reached TeX's main loop (i.e., have gone through the whole machinery of expansion) and have an impact on typesetting. Definitions are performed right away.

Implementing unravel was and is of course difficult, and I very often ended up relying on the TeX source code to find out how Knuth did various things. In particular, the nesting of conditionals is a nightmare. Those who know the source of TeX can probably recognise clear traces of that influence in names such as \@@_scan_int: or \@@_get_x_next:. Also, each primitive is given a command code and a character code, whose values follow precisely those of pdfTeX.

One direction I would like to explore is to output all the data to an XML file (or other format) which could then be processed through various other tools, perhaps giving more interactivity. Another interesting aspect would be to produce typeset content rather than on-screen diagnostics. This would make it possible to differentiate more visually, for instance, between the pieces of the || region, and also to differentiate category codes through color, which can be important in some debugging tasks. In the direction of producing fewer steps, I am wondering whether giving a regular expression, and silently performing the expansions of commands that match it, would provide a useful filter.
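The regular-expression filter mentioned last could be prototyped outside TeX before building it into the package. The sketch below is in Python and rests on assumptions: it treats each step as a hypothetical (command name, description) pair, which is not a format that unravel produces, and the function name filter_steps is invented for illustration.

```python
import re


def filter_steps(steps, silent_pattern):
    """Drop steps whose command name matches silent_pattern.

    `steps` is a list of (command_name, description) pairs; this layout is
    an assumption made for the sketch, not a format produced by unravel.
    """
    silent = re.compile(silent_pattern)
    return [(name, text) for name, text in steps if not silent.search(name)]
```

For instance, passing the pattern r"^\\exp_" would hide every step whose command name starts with \exp_, such as \exp_after:wN, while leaving the other steps visible.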