[TeX/LaTeX] A guide to understanding expandability: when to write protected functions and when not to

expansion, latex3

I'm having difficulty understanding (and appreciating) the concept of expandability. I'm very murky about understanding when and how expandability impacts me in writing code for my documents.

I've read "Why isn't everything expandable?". The answer was interesting and useful, but it didn't get at the heart of what I'm curious about. I've also perused a number of the answers to other questions involving expandability: of particular interest was this post.

In response to a recent question of mine, it was explained that document commands are protected and hence not expandable. Understanding this allowed me to write what I wanted and to get the effect I expected.

And in a comment to another question of mine, it was explained how one should use \cs_new_protected:Npn "when the function does unexpandable jobs such as setting token lists or sequences."

For years, I've been writing code like

\newcommand{\currentanswer}{}
\newcommand{\setcurrentanswer}[1]{\renewcommand{\currentanswer}{#1}}

knowing that after calling \setcurrentanswer, any call to \currentanswer will result in the desired output. Am I relying upon (un)expandability here? I'm not really sure; I only know that it does what I want. Then there are times I know I can throw in a \protect to get the result I want: but, I really don't understand the why of it; I just know it gets the job done.

Recently, I've been trying to learn some LaTeX3: the more I play with it, the more I like it. LaTeX, which I always thought was pretty powerful, is suddenly much more powerful and transparent in the manner that macros and functions can be defined. But now I also seem to be running up against this issue of expandability, whereas before I could blithely go about my business ignorant of some of the subtleties of what I was doing.

While I am asking multiple questions here, I suspect that they really have much the same answer: hence I'm not splitting them across multiple posts.

  1. Could someone take the time to explain some of the nuances of expandability, or, if not, point me to a good reference?

  2. How do I know when I'm working with a protected function/macro?

  3. Are protected and unexpandable the same thing?

  4. Could someone explain the preference for protected functions in LaTeX3?

  5. And finally, apart from the answers to the above questions, why would it be preferable to protect functions which perform unexpandable tasks, such as setting tokens and sequences? (I am very interested in understanding this last question.)

Best Answer

An expandable command is one which can be converted 'fully' into its output inside a TeX \edef or \write (and a few other places). Thus for example

\def\testa{\testb}
\def\testb{\testc}
\def\testc{d}
\edef\teste{\testa}
\show\teste

will give

> \teste=macro:
->d.

i.e. all of the steps have been expanded, and we have just characters.

For text, this is nice and simple, but when you get TeX primitives involved things are more complex as some are expandable and some are not. Broadly, anything which performs an assignment is unexpandable. So if we have

\def\testa{\testb}
\def\testb{\testc}
\def\testc{\def\ARG{d}}
\def\ARG{}
\edef\teste{\testa}
\show\teste

we get

> \teste=macro:
->\def {d}.

Notice how the \def is left unchanged but the \ARG has vanished: it got expanded to what it is defined as (empty).
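For contrast, here is a small sketch with an expandable primitive. Conditionals such as \ifnum are expandable, so they are evaluated inside the \edef and only the chosen branch survives. The name \testf is made up for this illustration.

```latex
% \ifnum is an expandable primitive: the test is carried out
% during the \edef, and only the selected branch is stored.
\edef\testf{\ifnum 1<2 yes\else no\fi}
\show\testf
```

The terminal should show \testf as a macro whose replacement text is just yes: no trace of \ifnum, \else or \fi is left behind, unlike the \def in the example above.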

e-TeX allows us to define a protected macro. These do not expand inside an \edef, so

\def\testa{\testb}
\def\testb{\testc}
\def\testc{\def\NOTARG{d}}
\protected\def\NOTARG{}
\edef\teste{\testa}
\show\teste

now yields

> \teste=macro:
->\def \NOTARG {d}.

There is a subtle but important point here: \def is an unexpandable primitive, while \NOTARG is now a protected macro. You can tell that \NOTARG is protected using \show:

> \NOTARG=\protected macro:
->.

where the \protected tells us what we need to know. However, you have to know that \def is not expandable.
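To illustrate that last point, a quick check, assuming nothing beyond the primitives themselves: \show on a primitive only echoes its name, whether the primitive is expandable or not.

```latex
% \show cannot distinguish these two cases:
\show\def    % unexpandable primitive (performs an assignment)
\show\ifnum  % expandable primitive (a conditional)
```

Both report the primitive as being equal to itself (for instance > \def=\def.), so the only ways to find out are to know the rules, check the documentation, or experiment inside an \edef.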


In the LaTeX3 documents, rather than expect people to learn the rules we've gone with a different approach: we document which functions are expandable (they are marked with a star). The reason everything else is then protected is that 'partial' expansion is a real issue. If you do

\def\testa{\let\testb\testc}
\edef\testb{\testa}

you get

! Undefined control sequence.
\testa ->\let \testb 
                     \testc 

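as \let is unexpandable and so is left alone by the \edef, which then carries on and tries to expand \testb; but \testb is still undefined at that point, because the \edef has not finished defining it. This gets worse when you look at 'real' documents, as the problem can be hidden many layers down.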
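As a sketch of how protection avoids this, the same macro can be declared with \protected (this needs an e-TeX-enabled engine, which all current engines are). The name \testd is introduced here only so that the result can be inspected separately.

```latex
% Same body as before, but now \testa is protected,
% so the \edef does not look inside it at all.
\protected\def\testa{\let\testb\testc}
\edef\testd{\testa}
\show\testd
```

There is no error this time: \testd simply contains the single token \testa, and the \let is only carried out later, when \testa is actually executed.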

Many of the issues people see in real LaTeX2e documents, for example where they forget \protect and have trouble, would be bypassed if most commands were protected. In general, you find a lot more (La)TeX code that is not expandable than code that is, so the position for LaTeX3 is that expandable code is the exception, certainly for document commands. (Typesetting is not expandable, and that's what happens in documents.)

This leads us on to what I call the 'sheep and goats' approach to protected functions: all LaTeX3 code is either protected or fully expandable ['safe' (will give the expected result) inside \edef/x-type expansion], even if we are talking about auxiliary functions. The result is that we can always be sure if a function can be used in an expansion context: if it can, it's marked with a star, otherwise it will be protected and won't expand part way. So the 'correct' way to write LaTeX3 code is that if you use anything that is not expandable (i.e. not starred in the documentation) in your code, then you have to use \cs_new_protected:Npn or similar, and not \cs_new:Npn, etc.
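A minimal sketch of this rule applied to the \setcurrentanswer example from the question follows. The module prefix demo and the names \l_demo_answer_tl, \demo_set_answer:n and \demo_double:n are hypothetical, chosen only for illustration. It assumes a recent LaTeX kernel in which expl3 and \NewDocumentCommand are available without loading packages; on older installations, loading xparse should provide the same.

```latex
\documentclass{article}
\ExplSyntaxOn
% Hypothetical variable holding the current answer.
\tl_new:N \l_demo_answer_tl

% \tl_set:Nn performs an assignment and is not starred in the
% documentation, so the function using it must be protected.
\cs_new_protected:Npn \demo_set_answer:n #1
  { \tl_set:Nn \l_demo_answer_tl {#1} }

% \int_eval:n is expandable (starred), so \cs_new:Npn is appropriate
% and the function is safe inside x-type expansion.
\cs_new:Npn \demo_double:n #1
  { \int_eval:n { 2 * (#1) } }

% Document-level commands; \NewDocumentCommand creates protected commands.
\NewDocumentCommand \setcurrentanswer { m }
  { \demo_set_answer:n {#1} }
\NewDocumentCommand \currentanswer { }
  { \tl_use:N \l_demo_answer_tl }
\ExplSyntaxOff

\begin{document}
\setcurrentanswer{42}
The current answer is \currentanswer.
\end{document}
```

Had \demo_set_answer:n been created with \cs_new:Npn instead, it would still work in ordinary use, but inside an \edef or x-type argument it would expand part way and leave a stray assignment behind, which is exactly the 'partial expansion' problem described above.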
