When does TeX do macro expansion?

macros tex-core

Having read some chapters of the TeXbook, I know that TeX eats input lines from the command line or from files and converts them into tokens. When these tokens reach its gastro-intestinal tract, TeX digests them by converting them into boxes, glue and other items such as whatsits, and then constructs lists from them.

With regard to macros: at which stage does TeX save a macro definition, and at which stage does it perform macro expansion, that is, substitute the replacement text?

Best Answer

That's easy! TeX always does macro expansion, except when it doesn't.

On page 215 of the TeXbook, in the second double-dangerous-bend paragraph, we read

Expansion is suppressed at the following times:

When tokens are being deleted during error recovery (see Chapter 6).

When tokens are being skipped because conditional text is being ignored.

When TeX is reading the arguments of a macro.

When TeX is reading a control sequence to be defined by \let, \futurelet, \def, \gdef, \edef, \xdef, \chardef, \mathchardef, \countdef, \dimendef, \skipdef, \muskipdef, \toksdef, \read, and \font.

When TeX is reading argument tokens for \expandafter, \noexpand, \string, \meaning, \let, \futurelet, \ifx, \show, \afterassignment, \aftergroup.

When TeX is absorbing the parameter text of a \def, \gdef, \edef, or \xdef.

When TeX is absorbing the replacement text of a \def or \gdef or \read; or the text of a token variable like \everypar or \toks0; or the token list for \uppercase or \lowercase or \write. (The token list for \write will be expanded later, when it is actually output to a file.)

When TeX is reading the preamble of an alignment, except after a token for the primitive command \span or when reading the ⟨glue⟩ after \tabskip.

Just after a token such as $₃ that begins math mode, to see if another token of category 3 follows.

Just after a `₁₂ token that begins an alphabetic constant.

Not unreasonable, is it? When making definitions, we want nothing to be expanded (except for the replacement text of \edef and \xdef); the fourth, fifth, sixth and seventh bullets deal with this. The same holds when we store a token list in a register or hand one to \write.

Similarly, the token after \expandafter, \noexpand, \afterassignment or \aftergroup must not be expanded, for obvious reasons; it will be expanded later, when TeX examines it again at the appropriate time.
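To make the difference between \def and \edef concrete, here is a minimal plain TeX sketch. The macro names \inner, \lazy and \eager are made up for the illustration.

```tex
\def\inner{first}
\def\lazy{\inner}    % replacement text is absorbed unexpanded: \lazy holds the token \inner
\edef\eager{\inner}  % \edef expands while absorbing: \eager holds the characters "first"
\def\inner{second}   % redefine \inner after both definitions were made
\message{[\lazy] [\eager]}  % the terminal shows [second] [first]
\bye
```

\lazy picks up the later redefinition because \inner was stored as a token and is only expanded when \lazy is used; \eager was frozen at definition time.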
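A small plain TeX sketch of this behaviour, with hypothetical macro names \inner, \target and \result: each \expandafter sets the following token aside without expanding it and expands the token after that exactly once.

```tex
\def\inner{\target}
\def\target{xyz}
% \def, \result and { are each set aside unexpanded by an \expandafter;
% only \inner is expanded (one step, giving \target) before \def is executed.
\expandafter\def\expandafter\result\expandafter{\inner}
\message{\meaning\result}  % reports macro:->\target
\bye
```

Note that \meaning itself is on the list above: it reads \result without expanding it.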

The last bullet has a technical reason: if you want to refer to an alphabetic constant that corresponds to a character with \catcode 0, 5, 9, 13, 14, or 15, the character can be “escaped” with a backslash in front of it, but this doesn't actually form a control sequence that gets expanded. So you can write `\^^M if you want to refer to the constant 13, or do \chardef\%=`\%.

The second bullet can be supplemented by an important remark: although TeX does no expansion when skipping conditional text, it does examine the tokens in order to match conditionals with their \else or \fi. Any token that has been \let to a primitive conditional, to \else or to \fi counts in this respect.
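As a quick check of the two constants mentioned above, the following plain TeX lines print their numeric values. This is only a sketch; \number is used here just to make the values visible.

```tex
\chardef\%=`\%   % the definition quoted above: \% becomes character 37
% After the backquote, \% and \^^M are read as one-character control
% sequences but not expanded; only their character codes are used.
\message{\number`\%, \number`\^^M}  % the terminal shows 37, 13
\bye
```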
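The following plain TeX sketch illustrates the remark. The names \myiftrue and \hidden are hypothetical: the first is \let to a primitive conditional and is therefore counted while skipping, the second is a macro and is not looked into.

```tex
\let\myiftrue=\iftrue  % \myiftrue now has the meaning of a primitive conditional
\def\hidden{\iftrue}   % \hidden is a macro; its conditional sits in the replacement text

\iffalse
  \myiftrue \fi  % counted while skipping, so it needs its own closing token
  \hidden        % not expanded while skipping, so nothing is counted here
\fi              % closes the outer conditional
\message{balanced}
\bye
```

If \hidden were followed by an extra \fi inside the skipped text, that \fi would close the \iffalse prematurely, because TeX never sees the \iftrue inside \hidden.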

Related Solutions

[Tex/LaTex] Why isn’t everything expandable

While a definitive answer can only come from the Stanford team involved in development of TeX, and from Professor Knuth in particular, I think we can see some possible reasons.

First, Knuth designed TeX primarily to solve a particular problem (typesetting The Art of Computer Programming). He made TeX sufficiently powerful to solve the typesetting problems he faced, plus the more general case he decided to address. However, he also kept TeX (almost) as simple as necessary to achieve this. While expandable macros are useful, they are not required to solve many issues.

Secondly, there are cases where an expandable approach would be at least potentially ambiguous. Bruno's \edef\foo{\def\foo{abc}} is a good case. I'd say that here the expected result with an expandable \def is that \foo expands to nothing, but I'd also say this is not totally clear. There is the much more common case where you want something like

\begingroup
\edef\x{%
  \endgroup
  \def\noexpand\foo{\csname some-macro-to-fully-expand\endcsname}%
}
\x

which would be made more complex with expandable primitives.
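For readers who want to see the pattern run, here is a self-contained plain TeX version. The definition of the inner macro (expanding to abc) is an assumption added only so that there is something to expand.

```tex
% Assumed for the demonstration: the macro to be fully expanded yields "abc".
\expandafter\def\csname some-macro-to-fully-expand\endcsname{abc}

\begingroup
\edef\x{%
  \endgroup                 % unexpandable, so it is stored in \x as is
  \def\noexpand\foo{\csname some-macro-to-fully-expand\endcsname}%
}
% \x now holds: \endgroup \def \foo {abc}
\x                          % closes the group, then defines \foo outside it
\message{\meaning\foo}      % reports macro:->abc
\bye
```

The pattern relies on \endgroup and \def being unexpandable: they survive the \edef untouched and are executed only when \x is used.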

The above example points to another grey area: what would happen with things like \begingroup and, more importantly, \relax? The fact that the latter is a non-expandable no-op is often important in TeX programming. (Indeed, the fact that \numexpr, etc., gobble an optional trailing \relax is sometimes regarded as a bad thing.)

Finally, I suspect that ease of implementation is important. Having separate expansion and execution steps makes the flow relatively easy to understand and, I suspect, to implement. An approach which mixes expansion and execution requires a more complex architecture. Here, we have to remember when Knuth was writing TeX, and the fact that programming ideas which we take for granted today were not necessarily applicable in the late 1970s. A fully expandable approach would, I suspect, have made the code more complex and slower. The speed impact mattered when TeX was running on 'big' computers.
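The difference can be seen in a short sketch. It assumes an engine with the e-TeX extensions (for example pdftex), since \numexpr is not part of Knuth's TeX; the macro names \kept and \gobbled are made up.

```tex
\count255=3
\edef\kept{\the\count255\relax}        % \relax ends the register number and stays
\edef\gobbled{\the\numexpr 1+2\relax}  % \numexpr removes its trailing \relax
\message{[\meaning\kept] [\meaning\gobbled]}
% The terminal shows [macro:->3\relax ] [macro:->3]
\bye
```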

[Tex/LaTex] Why does TeX remove braces around delimited arguments

One thing to note is that TeX's only means of nesting arguments are braces. You can define a macro \def\whatever[#1]{...} but when you call it as \whatever[oh[well]], things go down the drain awfully. Calling it as \whatever[{oh[well]}] however works swimmingly, and \whatever never notices it has been taken for a ride by slipping a ] into its argument. So the braces can be used as a means of hiding occurrences of the closing delimiter from TeX without actually affecting the intended argument.
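A minimal plain TeX sketch of the two calls; the body of \whatever (printing its argument in parentheses on the terminal) is an assumption chosen to make the argument visible.

```tex
\def\whatever[#1]{\message{(#1)}}
\whatever[oh[well]]    % argument ends at the first ]: shows (oh[well), a stray ] is typeset
\whatever[{oh[well]}]  % braces hide the inner ] and are then removed: shows (oh[well])
\bye
```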

It also means that whenever you call a macro using delimited arguments with a non-literal argument (more exactly, an argument not completely under your own control, as it often happens when you write a macro package to be used by others), you should always add a layer of braces around each delimited argument, like \whatever[{#1}] or similar. There is no other way to ensure that arguments will not get chopped up into something different because they themselves may contain a closing bracket.
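The advice can be sketched as follows in plain TeX. The wrapper names \unsafe and \safe are hypothetical, and \whatever again just prints its argument.

```tex
\def\whatever[#1]{\message{(#1)}}
\def\unsafe#1{\whatever[#1]}   % passes the user's text through unprotected
\def\safe#1{\whatever[{#1}]}   % adds a layer of braces around the delimited argument

\unsafe{oh[well]}  % shows (oh[well) and leaves a stray ] behind
\safe{oh[well]}    % shows (oh[well])
\bye
```

Because TeX strips the outer braces of a delimited argument that consists of a single group, the extra layer added by \safe does not change what \whatever receives.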
