[TeX/LaTeX] Why is the .toc file created at \end{document}?

To create a Table of Contents (for example), LaTeX2e follows the process:

  • Run 1

    • .aux file is created from scratch after \begin{document}.
    • When processing section headers, \addtocontents writes a \@writefile{toc}{\contentsline ...} line with the pertinent sectioning information to the .aux file.
    • At \end{document}, \input the .aux file, which executes \@writefile to write the \contentsline entries to the .toc file.
  • Run 2

    • At \begin{document}, deactivate \@writefile and re-input the .aux file. This generates labelling information, etc., for the current compilation and has no effect on the .toc file already generated.

    • After \begin{document}, the .aux file is opened for writing, which clears its contents. The process starts anew.
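A minimal test document is enough to watch the two runs described above. The sketch below assumes the standard article class with no extra packages. The file contents quoted in the comments show only the rough shape of the lines, since the exact trailing arguments differ between LaTeX releases (and change again when hyperref is loaded).

```latex
% Minimal test document; compile it twice and inspect the
% .aux and .toc files after each run.
\documentclass{article}
\begin{document}
\tableofcontents        % run 1: empty, run 2: shows the section
\section{Introduction}
Some text.
\end{document}

% After run 1, the .aux file contains a line of roughly this shape:
%   \@writefile{toc}{\contentsline {section}{\numberline {1}Introduction}{1}}
% and the .toc file, written while the .aux file is re-read at
% \end{document}, contains roughly:
%   \contentsline {section}{\numberline {1}Introduction}{1}
% Newer LaTeX releases append further arguments to these lines.
```

On the second run \tableofcontents reads the .toc file left behind by the first run before opening it again for writing.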

The LaTeX2e sources don't go into detail about why the system was designed this way. Since TeX has only a limited number of write streams, writing all .toc information in one pass at \end{document} minimises the number of files that need to be kept open during the document. So that aspect is sensible.

But why is the .aux file input both at the beginning and the end of the document? Couldn't the .toc file be generated at \begin{document} instead?

(One difference is that if ‘Run 1’ is terminated prematurely, no .toc file will be generated at all, so there'll be no ToC, whereas if the .toc file were created at \begin{document} it would contain a partial ToC. But that's arguably no better or worse. And you could add, say, a \@finishedauxtrue flag, written to the .aux file at \end{document}, to catch this happening if necessary.)
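The flag idea could be sketched roughly as follows. This is only an illustration, not something the kernel provides: the conditional \if@finishedaux is a hypothetical name taken from the suggestion above, and the sketch assumes that the kernel reads the .aux file inside a group at \begin{document}, which is why the flag is set with \global.

```latex
\documentclass{article}
\makeatletter
% Hypothetical flag (not a kernel command); false by default.
\newif\if@finishedaux
% At \end{document}, while the .aux file is still open for writing,
% record that this run reached the end of the document.
\AtEndDocument{%
  \if@filesw
    \immediate\write\@mainaux{\string\global\string\@finishedauxtrue}%
  \fi}
% This hook runs after the .aux file of the previous run has been read,
% so the flag is true only if that run wrote the line above.
\AtBeginDocument{%
  \if@finishedaux\else
    \typeout{Previous run did not reach the end of the document
             (or there was no .aux file).}%
  \fi}
\makeatother
\begin{document}
\tableofcontents
\section{Introduction}
Some text.
\end{document}
```

The message also appears on the very first run, when there is no .aux file yet. As with any custom command written to the .aux file, compiling later without the \newif line would raise an undefined control sequence error until the old .aux file is deleted.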

[[EDIT: Well, loading the aux file at the end allows for some checking whether labels have changed, and so on. So the aux file does always need to be loaded twice; I guess my question is whether there's any difference for ToC generation when that happens.]]

Best Answer

There are certainly several ways to implement a data transport mechanism between runs. The LaTeX one attempts to be rather economical with resources.

There are a number of tasks that it caters for:

  • TOC-type files are only generated if the document asks for a TOC. Whether it does only becomes clear somewhere within the document (i.e., the information is not automatically available at \begin{document}), so this favors writing the TOC file at the end rather than at the beginning. Otherwise you either have to always write them or carry that information from one run to the next (which is not possible via the aux file, as the information would have to be known before the aux file is read).
  • For the same reason directly writing the TOC file (rather than using the aux file as an intermediate) is not possible (unless you always generate them and the toc comes either at the very beginning or the very end of the document). Also reusing parts of the toc data, e.g., for chapter tocs, would then be impossible.
  • Rereading the aux file at the end serves not only to generate the TOCs but also to post-process the data, e.g., for cross-reference checking, so it is needed in any case.
  • Reading the aux file at the beginning is necessary to obtain the data from the previous run (e.g., cross references).
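The first point can be observed with a small custom list. In the sketch below, the commands \note and \listofnotes and the file extension xyz are hypothetical names chosen for illustration; \@starttoc and \addcontentsline are the kernel commands that the standard lists also use.

```latex
\documentclass{article}
\makeatletter
% Hypothetical list of notes stored in \jobname.xyz.
% \@starttoc reads the existing .xyz file (if any) and then opens it
% for writing; this is the point where the document "asks" for the file.
\newcommand\listofnotes{\section*{Notes}\@starttoc{xyz}}
% Each note writes a \@writefile{xyz}{...} line to the .aux file.
\newcommand\note[1]{\addcontentsline{xyz}{subsection}{#1}}
\makeatother
\begin{document}
\section{One}
\note{First note}
\section{Two}
\note{Second note}
% Comment out the next line, delete any old .xyz file and recompile:
% the .aux file still gets its \@writefile{xyz} lines, but no .xyz
% file is written.
\listofnotes
\end{document}
```

Because the list is requested only at the very end of the document, the request cannot be known at \begin{document} without some extra mechanism, which is the situation the first bullet describes.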

So yes, one could produce the TOC-type files at the beginning, but one would then either need an additional mechanism to pass on the information about which files to generate, or need to produce them always, even if no TOC, LOT, etc. is asked for. In short, there is no gain (only loss).

Unless you want to read the whole document into memory, there is no way to avoid a multi-pass system (well, it would be multi-pass even then). Keeping the whole document in memory is not economical with TeX, and in fact not really with any system. So it was/is not a historical artifact of small memory; it is a general necessity.