[Tex/LaTex] Expansion in \numexpr…\relax versus \pdfstrcmp


The \numexpr...\relax construction in eTeX evaluates numerical expressions, fully expanding tokens as it goes.

The \pdfstrcmp{...}{...} construction in pdfTeX lets us compare two lists of tokens after full expansion and conversion to a string (with \detokenize).
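
For concreteness, here is a minimal sketch of both primitives, assuming plain pdfTeX with the eTeX extensions enabled. The definition of \foo is only an illustration, and \message is used so that the results show up on the terminal:

    \def\foo{1+2}
    \message{\the\numexpr\foo\relax}   % writes 3
    \message{\pdfstrcmp{\foo}{1+2}}    % writes 0: the two strings are equal
    \message{\pdfstrcmp{\foo}{}}       % writes 1: "1+2" sorts after the empty string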

Are there specific token lists (parameter-less macros) \foo such that \the\numexpr\foo\relax correctly produces an integer, but \pdfstrcmp{\foo}{} causes a TeX error? It seems that the expansion behaviour is the same in both cases, but one converts its argument to an integer, and the other one to a string.

Best Answer

I see two cases where \the\numexpr...\relax works, but \pdfstrcmp{}{...} will blow up, excluding the obvious case where ... is replaced by 0\relax\undefined, terminating the \numexpr prematurely.

  1. TeX interprets `\a as a number, without expanding \a. Hence, \the\numexpr`\a\relax expands to 97 (the character code of a), whereas \pdfstrcmp{}{`\a} blows up if \a is not defined.

  2. Using \protected control sequences can also cause trouble, because those are forcefully expanded "from the left" in a \numexpr, but will not be expanded by \pdfstrcmp. Take for instance

    \protected\def\gob#1{}
    \the\numexpr 0\gob\undefined  \relax
    \pdfstrcmp{}{0\gob\undefined}
    

In the case of \numexpr, \gob is expanded and removes the \undefined control sequence. In the second case, however, the \edef-like expansion leaves the \protected control sequence \gob untouched, and goes on to expand \undefined, which is, well, undefined.
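
The first case can be checked in the same way. This is a sketch assuming plain pdfTeX; the \let line is my addition and is only there to guarantee that \a really is undefined (LaTeX, for example, does define \a):

    \begingroup
    \let\a\undefined                   % assumption: force \a to be undefined
    \message{\the\numexpr`\a\relax}    % writes 97; \a is never expanded
    % \message{\pdfstrcmp{}{`\a}}      % if uncommented: ! Undefined control sequence
    \endgroup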

The original goal I had was to define a macro that takes an argument which is either empty or an integer expression, and either evaluates the expression or substitutes a default value when the argument is empty. It seemed illogical to perform expansion in the \numexpr case but not for the emptiness test, so I was thinking of testing with \pdfstrcmp{}{...}. That can't work, for the reasons above. An uglier but more correct choice is the following:

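    \catcode`@=11
    \def\@gobble#1{}% already provided by LaTeX; plain TeX needs this line
    % Append two \z@ sentinels, evaluate, then let \evaluate@ clean up.
    \def\evaluate#1{\expandafter\evaluate@\the\numexpr#1\z@\z@\relax}
    \def\evaluate@#1\z@#2\relax{#1}

    \evaluate{1+2+3}     % 6
    \evaluate{\empty}    % 0
    \evaluate{\@gobble\a}% 0
    \evaluate{`\a}       % 97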

If the argument to \evaluate is empty or expands to an empty argument, the \numexpr expansion will go through all of it and reach the first \z@, evaluating that to 0 (default value), then stop because \z@ does not make sense in an integer expression there. The auxiliary cleans up.

On the other hand, if the argument to \evaluate is a correct integer expression, it is evaluated, and \numexpr stops expanding when encountering the first \z@, and the cleaning up macro removes both \z@.
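
To make the two cases concrete, here is how I understand the expansion to proceed, step by step. This is a hand-written trace in comments, not engine output, and the arrows are just notation:

    % Non-empty argument:
    %   \evaluate{1+2+3}
    %   -> \expandafter\evaluate@\the\numexpr 1+2+3\z@\z@\relax
    %   -> \evaluate@ 6\z@\z@\relax    (\numexpr stops before the first \z@)
    %   -> 6                           (#1 = 6, #2 = \z@)
    %
    % Empty argument:
    %   \evaluate{}
    %   -> \expandafter\evaluate@\the\numexpr \z@\z@\relax
    %   -> \evaluate@ 0\z@\relax       (the first \z@ is read as the value 0)
    %   -> 0                           (#1 = 0, #2 is empty)

In both cases the trailing \relax is never consumed by \numexpr, which is why \evaluate@ can use it as its final delimiter.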

I just thought of a better way: "f-expand" the argument (expand fully from the left, stopping at the first non-expandable token, and removing that token if it is a space) before testing for emptiness:

    \def\evaluate#1{\expandafter\evaluate@\expandafter{\romannumeral-`0#1}}
    \def\evaluate@#1{\the\numexpr\ifcat X\detokenize{#1}X\z@\fi#1\relax}

If the argument is empty or will expand to become empty, \romannumeral-`0#1 expands to nothing, and the test in \evaluate@ is true, which means we insert \z@ (default value). Otherwise #1 is evaluated.
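
As a quick check, the f-expansion version can be exercised as follows. This is a sketch assuming plain pdfTeX with eTeX extensions; the \@gobble definition and the \message wrappers are my additions for testing and are not part of the solution itself:

    \catcode`@=11
    \def\@gobble#1{}% already provided by LaTeX; plain TeX needs this line
    \def\evaluate#1{\expandafter\evaluate@\expandafter{\romannumeral-`0#1}}
    \def\evaluate@#1{\the\numexpr\ifcat X\detokenize{#1}X\z@\fi#1\relax}

    \message{[\evaluate{1+2+3}]}      % writes [6]
    \message{[\evaluate{\empty}]}     % writes [0]
    \message{[\evaluate{\@gobble\a}]} % writes [0]
    \message{[\evaluate{`\a}]}        % writes [97]
    \bye

Since every step is expandable, \evaluate works inside \message (and \edef) without further precautions.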