[Tex/LaTex] \jobname, character codes and \detokenize

tex-core

I recently wanted to compare \jobname to a string (with the xstring package) and was rather surprised that the string never matched. The internet told me that I had to use \detokenize on the string. \detokenize is an e-TeX command that,

when followed by a <general text>, expands to yield a sequence of character tokens of \catcode 10 (space) or 12 (other) corresponding to a decomposition of the tokens of the <balanced text> of the unexpanded <general text>;

Unfortunately, I do not own a copy of the TeXbook, so here are three questions:

  • What is “balanced text”?
  • Why does \jobname expand to the job name with “wrong” catcode?
  • How did people compare \jobname to a string before e-TeX?

Best Answer

Three questions there, but I think you'll be let off!

'Balanced text' means that the argument has to have balanced grouping characters, usually { and } pairs. This is because \detokenize requires an argument starting with a token with category code 1 (begin-group), in the same way as a token register. Indeed, you can do very similar things with a token register and with \detokenize:

\newtoks\mytoks
\def\test{stuff}
\mytoks\expandafter{\test}% \mytoks holds 'stuff' as letters
\detokenize\expandafter{\test}% Outputs 'stuff' as 'other' tokens
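To make the difference between the two results visible, here is a minimal sketch that compares them with \ifx. The macro names \letters and \others are made up for this illustration. \ifx applied to two macros compares their replacement texts token by token, looking at both the character code and the category code, so the first test should come out "same" and the second "different".

```latex
\documentclass{article}
\begin{document}
\def\test{stuff}
\def\letters{stuff}% typed in the file: five category code 11 (letter) tokens
\edef\others{\detokenize\expandafter{\test}}% five category code 12 (other) tokens

% Same characters, same category codes
\ifx\test\letters same\else different\fi
\par
% Same characters, but letters on one side and 'other' on the other
\ifx\test\others same\else different\fi
\end{document}
```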

On the category codes in \jobname, there are a number of places where you get a 'string' from TeX where everything except spaces has category code 12. You see the same with \the\<somedimen> and \meaning (more on the latter in a moment). You'd have to ask DEK for the full story, but my understanding is that this 'string' approach is used so that no tokens are accidentally added to a control sequence name. There are places where if they were 'letters' then trouble might arise.
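This is also why the comparison in the question fails and why \detokenize fixes it. The following is a sketch only, and it assumes the file is saved as test.tex so that \jobname expands to test; the names \jobstring, \letters and \others are invented for the example. Under that assumption the first test should print "no match" and the second "match".

```latex
% Assumption: this file is saved as test.tex, so \jobname expands to "test".
\documentclass{article}
\begin{document}
\edef\jobstring{\jobname}% characters with category code 12
\def\letters{test}% typed in the file: category code 11
\edef\others{\detokenize{test}}% category code 12, like \jobname

% Letters against 'other' characters: the category codes differ
\ifx\jobstring\letters match\else no match\fi
\par
% 'Other' against 'other': the token lists agree
\ifx\jobstring\others match\else no match\fi
\end{document}
```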

Finally, on the approach before e-TeX. As I said, \jobname is not the only place where you see 'string' output. In particular, \meaning does the same. So if you do

\def\testa{<whatever>}
\edef\testa{\meaning\testa}
\edef\testb{\jobname}
\edef\testb{\meaning\testb}
\ifx\testa\testb
...
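Filled out into a complete file, the \meaning approach might look like the sketch below. It again assumes the file is saved as test.tex, and the text in the two branches is just a placeholder. No e-TeX primitive is involved: both macros end up holding "macro:->" followed by the name, all as 'string' characters, so \ifx can compare them directly.

```latex
% Assumption: this file is saved as test.tex.
\documentclass{article}
\begin{document}
\def\testa{test}
\edef\testa{\meaning\testa}% now "macro:->test" as a 'string'
\edef\testb{\jobname}
\edef\testb{\meaning\testb}% "macro:->" followed by the job name
\ifx\testa\testb
  The job name is test.
\else
  The job name is something else.
\fi
\end{document}
```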

the test will be true if the two names agree as lists of characters. There are variations on this method, see for example LaTeX's \strip@prefix, which can be used to make a 'string' without any prefix:

\makeatletter
\def\testa{<whatever>}
\edef\testa{\expandafter\strip@prefix\meaning\testa}% Now a 'string'

(As pointed out by Martin Scharrer, LaTeX's \@onelevel@sanitize is the same as the above: \@onelevel@sanitize\testa would be equivalent to the last line above. To show what is going on it's clearer to see the \meaning but in use you'd pick \@onelevel@sanitize.)
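For completeness, a sketch of the same comparison using \@onelevel@sanitize, once more assuming the file is called test.tex. Since the prefix is stripped, \testa can be compared with an \edef of \jobname directly, without passing \jobname through \meaning.

```latex
% Assumption: this file is saved as test.tex.
\documentclass{article}
\begin{document}
\makeatletter
\def\testa{test}
\@onelevel@sanitize\testa% \testa now holds 'test' as a 'string'
\edef\testb{\jobname}% already a 'string'
\ifx\testa\testb
  The job name is test.
\else
  The job name is something else.
\fi
\makeatother
\end{document}
```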