What is the scope of \catcode?

tex-core

I want to use the underscore as an ordinary character in most cases, so I set this globally:

\catcode`\_=11

I usually write in Chinese, where ~ is not very useful, so I use it in place of _ as the category code 8 (subscript) character:

\catcode`\~=8

This works perfectly in my Chinese documents, including with TikZ, references and so on.


But when I write an English document, I sometimes want to keep ~ at category code 13 (active):

\catcode`\~=13

To use the underscore as a plain printable character and still have subscripts, I defined a new command for subscripts:

\def\mf#1{\catcode`_=8 _{#1}\catcode`_=11}

What I want is for $a\mf{b}$ to print a subscript "b" after "a". Instead, it prints a visible underscore "_" between "a" and "b", just like a_b.

What happens inside \mf{b}? Is it a matter of order or of scope? And how can I implement \mf?

Besides, when I load the underscore package with \usepackage{underscore}, an underscore inside a reference such as \ref{_} causes an error. That is why I tried to implement this with primitive commands.

Thanks.

Best Answer

Answering the question in the title, for the record, the scope of \catcode is the current TeX group, as usual with most assignments (unless preceded by \global).
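As a small illustration of that scoping (a sketch, assuming the usual starting point where _ has category code 8 outside the group):

\begingroup
\catcode`\_=11 % local assignment, undone at \endgroup
\message{\the\catcode`\_}% writes 11 to the terminal and log
\endgroup
\message{\the\catcode`\_}% writes 8: the old value is back

Writing \global\catcode`\_=11 inside the group instead would make the change survive \endgroup.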

But the problem here is not scope but the time of tokenisation. As TeX processes a line of input, it translates every character (or sequence of characters) it sees into a token. Tokens are either character tokens (for example _8 or A11, where the number is the character's category code) or control symbol/sequence tokens (for example \, or \relax). TeX turns a character into a token as soon as it “sees” the character for the first time, using the category codes in force at that moment; that token is then frozen and cannot be changed to something else.
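A minimal sketch of this freezing, assuming _ starts out with its default category code 8 and that the last line sits in the document body; \test is a hypothetical name used only for illustration:

\def\test{_}   % _ is read now and stored in \test as _8
\catcode`\_=11 % from here on, a _ in the input is read as a letter
$a\test b$     % \test still holds _8, so b is set as a subscript

The later \catcode change affects only characters that TeX has not read yet; the _ already stored in \test keeps code 8.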

The practical implication of this in your case is that here:

\catcode`\_=11
\def\mf#1{\catcode`_=8 _{#1}\catcode`_=11}

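the first line makes every _ read from then on a catcode 11 token. The second line then makes the definition, so every _ in the definition of \mf is stored with catcode 11. The \catcode assignments inside the body are only executed when \mf is used, which is too late to change the _ already stored, so _11{#1} just typesets a _. To make a subscript you need a _8 instead.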

There are a few ways you could do that, but the easiest is to use \sb, defined in the LaTeX kernel as:

\let\sb=_

so that \sb behaves exactly as a _8. You can also do \let\mf=\sb so that you can use \mf as an alias.
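As one of the other ways, here is a sketch that keeps the original \mf{...} syntax. It assumes \catcode`\_=11 is already in force: code 8 is restored inside a group before the definition is read, and \gdef lets the definition survive the group.

\begingroup
\catcode`\_=8 % lines read after this one see _ as a subscript character
\gdef\mf#1{_{#1}}% the body stores _8
\endgroup
$a\mf{b}$ % b is set as a subscript of a

After \endgroup the underscore is a letter again, while \mf keeps the _8 that was stored when it was defined.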