TeX-Core – Exploring the Components of (La)TeX’s Memory Usage


After reading Increase LaTeX capacity and not being able to summon Gandalf1, I am curious to know what contributes to each component of memory usage during a (La)TeX compilation. Take the above post's output as an example:

l.3593 ...temp.png}

If you really absolutely need more capacity,
you can ask a wizard to enlarge me.


Here is how much of TeX's memory you used:
 31937 strings out of 94500
 1176767 string characters out of 1176767
 272586 words of memory out of 1000000
 24170 multiletter control sequences out of 10000+50000
 11185 words of font info for 39 fonts, out of 500000 for 2000
 580 hyphenation exceptions out of 1000
 28i,7n,36p,345b,3810s stack positions out of 1500i,500n,5000p,200000b,5000s
PDF statistics:
 33619 PDF objects out of 300000
 7356 named destinations out of 131072
 48094 words of extra memory for PDF output out of 65536
!  ==> Fatal error occurred, the output PDF file is not finished!

In a "regular document" that loads packages and contains user-defined macros and environments, what constitutes a string, a string character, a word of memory, a multiletter control sequence, a word of font info, a hyphenation exception, and a stack position, and (when running pdftex) a PDF object, a named destination, and a word of extra memory?

The intent of this question is more to understand where one might run into problems when compiling, say, a 4000 page document – yikes! A more realistic scenario might be typesetting a very large document while loading a large number of packages, even though you only use a select few macros/environments from each. (La)TeX still loads each entire package into memory, leaving you with less to work with.

Reading pdf object/destinations/memory limits only suggests where one might have problems and how to boost (La)TeX's available capacity. However, it doesn't state which parts of the document contribute to which memory component. The other posts I've found are similar.
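For reference, the capacities shown in the log correspond to web2c settings that can be raised in a texmf.cnf file. A sketch follows; the parameter names are web2c's, but the values are arbitrary illustrations, not recommendations:

```
% texmf.cnf fragment (kpathsea % comments); each setting maps to a
% line of the "Here is how much of TeX's memory you used" report:
main_memory = 5000000   % "words of memory"
pool_size   = 6250000   % "string characters"
max_strings = 500000    % "strings"
hash_extra  = 600000    % "multiletter control sequences" (reported as hash_size+hash_extra)
save_size   = 80000     % the "s" entry in the stack positions line
```

After editing, the format files usually need to be regenerated (e.g. with fmtutil) for the new limits to take effect.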

Some of the memory output may be self-explanatory, but not all. For example, multiletter control sequences probably refer to definitions like \newcommand{\mycom}{...} and \def\stuff{...}, each of which I assume adds +1 to the tally. However, this would seem to exclude \def\a{...}, since \a is a single-letter control sequence? Also, does \def\stuff{...} add +1 to the strings count and +5 to the string characters (one per letter of stuff)?
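As a rough experiment (a sketch; the macro names mac1, mac2, ... are arbitrary), one can define many multiletter control sequences in a loop and compare the "multiletter control sequences" and "string characters" lines in the log against a run without the loop:

```latex
\documentclass{article}
\newcount\k
% Each \csname mac<k>\endcsname definition creates one new multiletter
% control sequence; its name should also consume one string plus its
% characters in the string pool.
\loop
  \advance\k by 1
  \expandafter\gdef\csname mac\number\k\endcsname{x}%
\ifnum\k<5000 \repeat
\begin{document}
Defined \number\k\ macros.
\end{document}
```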

I understand this will probably not sway me from any of my current usages, since they have never given me memory problems of this nature without the problem being on my end rather than the compiler's. However, it may improve my future (La)TeX programming of macros/environments.


1 By the way, TeX's output of If you really absolutely need more capacity, you can ask a wizard to enlarge me. is just epic.

Best Answer

Here is a short program that can print 152831 pages with no indication of memory problems.

\documentclass{article}
\usepackage{lipsum}
\begin{document}
\newcount\n
\n=0
% use a fresh name rather than overwriting the \message primitive
\def\mymessage{I can count to }
\loop
  \ifnum\n<37000
  \advance\n by 1
  \mymessage\number\n : \lipsum[1-2]

\repeat
\end{document}

You can increase the range in \lipsum[1-2] and test both your patience and TeX's limits! As long as you allow TeX to break pages easily, you are unlikely to hit any memory problems.
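If you want to watch the numbers rather than wait for a fatal error, TeX's \tracingstats primitive reports memory usage in the log; a minimal sketch:

```latex
\documentclass{article}
\usepackage{lipsum}
% \tracingstats=1 reports memory usage at the end of the run;
% \tracingstats=2 additionally reports it after each shipped-out page.
\tracingstats=2
\begin{document}
\lipsum[1-10]
\end{document}
```

Comparing the per-page reports shows whether memory use is roughly constant per page (healthy) or growing steadily (a leak, e.g. from accumulating definitions).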

Knuth created a very efficient memory management system. The details can be found in the TeX source. A somewhat limited explanation of how memory and strings are handled can be found in my answer to Delimiting a macro argument with the macro itself. Also check the TEXPOOL setting in your distribution.

Not only Knuth but also Lamport and all the LaTeX contributors took extreme care to conserve memory and to provide proper garbage collection. IMHO, studying the TeX source should be required reading in every Computer Science program. In the source linked above, check the Dynamic Memory Allocation section and note Knuth's comment (clause 119) on the most common problem encountered by new TeX users:

If memory is exhausted, it might mean that the user has forgotten a right brace...

On a last note, TeX does not keep your "book" in memory. It works on one page at a time ... well, almost. If you do not close a group with a right brace, TeX will continue scanning, potentially until the end of the source, and so it cannot de-allocate memory. Knuth introduced a number of checks and special commands to mitigate these issues (\long, \outer, etc.).
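A minimal way to see this (a sketch): a missing right brace makes TeX absorb everything that follows as replacement text, so nothing can be shipped out or freed until the scan ends:

```latex
\documentclass{article}
\begin{document}
% The right brace closing this definition is deliberately missing:
% TeX keeps scanning for the matching } and swallows the rest of the
% file; in a long document this is a classic route to
% "TeX capacity exceeded".
\def\broken{this group is never closed
All of this text becomes part of the definition instead of the page ...
\end{document}
```

With this short file, TeX ends with a "File ended while scanning definition of \broken" error; in a long enough file, memory runs out first.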

1 Exercise for the reader: halve the page height and see if you can double the number of pages.