[TeX/LaTeX] The difference between local and global in the TeX sense

grouping macros

The question "tikz declare function and babel french option" raised the matter of the difference between local and global in the TeX sense.

In the linked question, the babel shorthand ; should be deactivated inside a tikzpicture environment, which can be done with

\begin{tikzpicture}
   \shorthandoff{;}
   ...
\end{tikzpicture}

Best Answer

The concept of global and local has to do with assigning meaning to a token or a value to a register.

Tokens and their meaning

Tokens are the basic food of TeX and they come in two varieties: character tokens and symbolic tokens. Roughly speaking, the latter are those that we represent as "backslash+characters".

The meaning of a character token depends on the combination of "character code" and "category code" of that character; for instance, under the normal settings, the character A corresponds to the pair (65,11), while { to the pair (123,1). It's category code 1 that gives { its special meaning as group and argument delimiter. Similarly, \ corresponds to (92,0), so that the backslash acts as marker for symbolic tokens.

Many tokens have a predefined meaning when TeX starts a job. Every character token has a character code (that can't be changed) and a category code (which can be changed later); some symbolic tokens are primitive, so that TeX doesn't start from nothing.
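
As a quick check of the pairs quoted above, the character code and the category code of a character can be written to the terminal and the log file. This is only a sketch, assuming the default category codes of plain TeX or LaTeX; the values in the comments are the ones stated in the text.

\message{A: (\number`\A,\the\catcode`\A)}% A: (65,11)
\message{left brace: (\number`\{,\the\catcode`\{)}% left brace: (123,1)
\message{backslash: (\number`\\,\the\catcode`\\)}% backslash: (92,0)

Here \number`\A yields the character code and \the\catcode`\A the category code currently assigned to it.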

Assignments

At any moment, new symbolic tokens can be assigned a meaning or the meaning of already known tokens can be changed (for character tokens only the category code). What TeX does with a token depends on the context. For instance, \tracingmacros is a (primitive) integer register; with

\tracingmacros=1

TeX will assign the value 1 to the integer register, while with

\count255=\tracingmacros

the current value stored in \tracingmacros will be assigned as the value stored in the integer register \count255.

Both are examples of assignments. Every time the meaning of a token or a value stored in a register or array is modified, we are doing an assignment. Listing all kinds of assignments would be too long. The basic types are, in addition to those shown before,

\def\cs<parameter text>{<replacement text>}
\let\cs=<token>
\catcode<8 bit number>=<4 bit number>
<font selector>

(when LuaTeX or XeTeX is used, the <8 bit number> is a <21 bit number>). In the last type <font selector> denotes a symbolic token that selects a font as the current one (such an assignment is performed behind the scenes by LaTeX when one uses \bfseries or \textit). There are many more kinds, but listing them would take us too far.

To be precise, assignments cannot be performed at just any time. TeX performs assignments only when it is executing commands, not when it is expanding macros; in particular it will not perform assignments when computing the replacement text in an \edef or when writing out the result of a \write operation.
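
The following sketch illustrates that no assignment takes place inside \edef. The macro name \test is made up for the example and \count255 is used as a scratch register.

\count255=1
\edef\test{\count255=2 \the\count255}
\message{\meaning\test}% macro:->\count 255=2 1
\test % only now is the assignment executed, leaving 2 in \count255

While the replacement text is being built, \the\count255 is expanded and yields the old value 1, whereas \count255=2 is merely stored, because \count is not expandable.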

Active characters

A particular type of character token is one whose category code is 13. When TeX finds such a character it looks for its meaning as if it were a symbolic token, and that meaning has to be assigned with \def (and variations thereof) or \let. This is the well known case of ~.

Groups

As a rule, every assignment is local: it loses its force when the current group ends. What's a group? TeX can open a group in several ways:

  1. With a { that doesn't have the role of argument delimiter; the group will end with the } at the same brace level. One can use \bgroup for { and \egroup for } with the same effect (simple group).

  2. With \begingroup; the current group will end with the corresponding \endgroup (semisimple group).

  3. When the text to be typeset in a \vbox, \vtop, \vcenter or \hbox is being absorbed (box group).

  4. When the text for an alignment cell is being absorbed; it is almost the same as a box group (alignment group).

  5. When $ or $$ start a math formula (math formula group).
    The tokens $$ that start or end a math display are automatically supplied by LaTeX; never use $$...$$ directly in LaTeX.

  6. Other special cases that can be found in the TeXbook or in TeX by Topic (for instance, \left starts a group, ended by the matching \right).

Thus \begingroup\tensl ABC \endgroup DEF will print "ABC DEF" with only ABC in slanted type, because the assignment of the current font loses its force at \endgroup. Conversely

\def\my#1{-- #1 --}
\my{\tensl ABC} DEF

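will print "-- ABC -- DEF" and continue to use slanted type from ABC onwards, because the braces around the argument only delimit it: they are removed when \my is expanded, so the \tensl declaration is not seen inside a group.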
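
The scoping rule can be observed with any kind of assignment, not only with font selection. In this small sketch the macro name \status is invented for the purpose.

\def\status{outer}
\begingroup
  \def\status{inner}% local to the semisimple group
  \message{\status}% writes "inner"
\endgroup
\message{\status}% writes "outer"
{\def\status{braces}}% a simple group behaves the same way
\message{\status}% still writes "outer"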

Global assignments

Each assignment can be prefixed by \global and, in this case, its effect will transcend the current group. An abbreviation for \global\def is \gdef; so

\begingroup
\catcode`-=\active % \active here means 13
\gdef-{not a hyphen}
\endgroup

shows both kinds of assignments at the same time. The category code of - is changed only locally, but the meaning of - as active character is assigned globally. Whenever TeX finds a (45,13) pair, the definition of - as active character will be used, but - won't be active by default after \endgroup. More on this later.
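
A side-by-side comparison of a local and a global definition made in the same group may help. The names \localname and \globalname are hypothetical.

\def\localname{outer}
\def\globalname{outer}
\begingroup
  \def\localname{inner}% local: undone at \endgroup
  \gdef\globalname{inner}% global: survives \endgroup
\endgroup
\message{\localname, \globalname}% writes "outer, inner"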

LaTeX

What's to be considered as a group in LaTeX? In addition to the general TeX constructs above, environments also form groups, because of how LaTeX implements \begin and \end, issuing \begingroup with the former and \endgroup with the latter. The notable exception is the document environment, which doesn't form a group; this matters only from a theoretical point of view, since there can be only one document environment. The arguments to \mbox, \makebox, \fbox, \framebox, \sbox, \savebox and \parbox are also absorbed as groups, because these commands internally use \hbox, \vbox, \vtop or \vcenter. Similarly, every cell in a tabular or tabbing environment forms a group by itself, being a special case of an alignment group.

The LaTeX commands \newcommand, \renewcommand, \newenvironment and \renewenvironment are wrappers for \def, so they act only locally, and they can't be preceded by \global. So there's basically no way to do global assignments in LaTeX except by using lower level commands. An exception is assignments to LaTeX counters, which are always global (see at the bottom).
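
As an illustration, here is a minimal document where a redefinition made inside an environment is lost at its end, while a value saved with the lower level \global\let survives. The command names \status and \saved are made up for the example.

\documentclass{article}
\newcommand{\status}{preamble}
\newcommand{\saved}{preamble}
\begin{document}
\begin{center}
  \renewcommand{\status}{environment}% local to center
  \global\let\saved\status % survives \end{center}
\end{center}
\status, \saved % typesets "preamble, environment"
\end{document}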

The problem with the French semicolon

In order to cope with the peculiarities of French typography, babel changes some characters to category code 13 (active). This can cause problems with TikZ, which uses the semicolon as its statement delimiter and also the colon in its syntax. The developers of TikZ have done their best to avoid problems with this setting, but sometimes things can go wrong. What does

\usepackage[french]{babel}

do precisely? The details are quite involved, but among other things something like the following is performed (this is a very rough simplification):

\begingroup
\catcode`;=\active % \active here means 13
\gdef;{\unskip\penalty10000 \thinspace\string;}
\catcode`:=\active % \active here means 13
\gdef:{\unskip~\string:}
\endgroup
\AtBeginDocument{\catcode`;=\active \catcode`:=\active}

So when ; or : are found in the document, they will be active and expand like a macro to the replacement text above. This of course is not what TikZ expects from the statement delimiter, for instance.

An example is in "tikz declare function and babel french option":

\documentclass{article}
\usepackage[french]{babel}
\usepackage{tikz}

\begin{document}
\begin{tikzpicture}
  \tikzset{declare function={Carre(\t)=\t*\t;}}
  \draw plot [domain=-1:1] (\x,{Carre(\x)});
\end{tikzpicture}
\end{document}

The semicolon delimiting the \draw statement is correctly interpreted, because TikZ knows how to deal with an active semicolon in that situation. But the statement delimiter in \tikzset is seen as such only too late.

In order to solve the issue there are two methods:

  1. Put the \tikzset declaration in the preamble, where the semicolon is not active.

  2. Use a local assignment to make the semicolon inactive in the tikzpicture environment.

Thus both

\documentclass{article}
\usepackage[french]{babel}
\usepackage{tikz}
\tikzset{declare function={Carre(\t)=\t*\t;}}

\begin{document}
\begin{tikzpicture}
  \draw plot [domain=-1:1] (\x,{Carre(\x)});
\end{tikzpicture}
\end{document}

and

\documentclass{article}
\usepackage[french]{babel}
\usepackage{tikz}

\begin{document}
\begin{tikzpicture}
  \shorthandoff{;}
  \tikzset{declare function={Carre(\t)=\t*\t;}}
  \draw plot [domain=-1:1] (\x,{Carre(\x)});
\end{tikzpicture}
\end{document}

will work; \shorthandoff{;} is babel's way of (locally) turning off the active nature of a character token and basically means

\catcode`;=<whatever category code ; had at the start of the job>

without forcing us to know that the code to use is 12. Since the assignment is local, it will end at \end{tikzpicture}, so after that the character ; will be again active.

TikZ version 3.0.0 introduced the babel library, so the problems discussed above may no longer arise when

\usetikzlibrary{babel}

is used. (Added 2015-01-02.)
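
For comparison, this is how the problematic example would look with the library loaded and without \shorthandoff. It is a sketch that assumes TikZ 3.0.0 or later and is expected to compile, but it has not been checked against every combination of babel and TikZ versions.

\documentclass{article}
\usepackage[french]{babel}
\usepackage{tikz}
\usetikzlibrary{babel}% available from TikZ 3.0.0

\begin{document}
\begin{tikzpicture}
  \tikzset{declare function={Carre(\t)=\t*\t;}}
  \draw plot [domain=-1:1] (\x,{Carre(\x)});
\end{tikzpicture}
\end{document}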

Other uses of local assignments

Suppose one wants to have "named theorems" in addition to "generic" ones. For instance, the book one is writing is planned to contain

Rank-nullity theorem 1.4
Gram-Schmidt theorem 2.3

among other statements labeled plainly Theorem. One strategy can be to define a new theorem-like environment for each name, sharing the counter with the plain theorem environment. One can do better, though:

\newtheorem{theorem}{Theorem}[chapter]
\newtheorem{namedtheoremaux}[theorem]{\protect\thistheoremname}

\newcommand{\thistheoremname}{}
\newenvironment{namedtheorem}[1]
  {\renewcommand{\thistheoremname}{#1}\begin{namedtheoremaux}}
  {\end{namedtheoremaux}}

so that the input can be

\begin{namedtheorem}{Rank-nullity theorem}
...
\end{namedtheorem}

...

\begin{namedtheorem}{Gram-Schmidt theorem}
...
\end{namedtheorem}

The \protect in the definition is not necessary with the standard \newtheorem of LaTeX or the one provided by amsthm; it is for ntheorem and does no harm otherwise.

The redefinition of \thistheoremname is local, so it will affect only the current namedtheorem environment.

Exceptions

Some assignments made by TeX are inherently global. The most frequently used assignments of this kind are

\hyphenchar\font=`-
\hyphenchar\font=-1

The first one tells TeX that the character to use as a hyphen when typesetting in the current font should be the normal hyphen; the second one tells that hyphenation with this font is suppressed.

It's not a good idea to say

This is a normal paragraph and the following word must not be
{\hyphenchar\font=-1 hyphenated} but others can

because no word in the paragraph and in the following ones using the same font will be hyphenated, since the assignment is global even if not preceded by \global; however

This is a normal paragraph and the following word must not be
\hyphenchar\font=-1 hyphenated \hyphenchar\font=`- but others can

won't do the trick either, because TeX uses only one hyphenchar for a font, the one that's associated with it when the paragraph ends.
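
That the assignment escapes the group can be verified directly. The sketch below assumes that the current font starts with the normal hyphen, whose character code is 45.

\message{\the\hyphenchar\font}% normally writes 45
{\hyphenchar\font=-1 }% attempt at a "local" change inside a group
\message{\the\hyphenchar\font}% writes -1: the group did not restore the value
\hyphenchar\font=`\- % put the normal hyphen back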

LaTeX again

LaTeX operations on counters

  • \setcounter
  • \addtocounter
  • \stepcounter and \refstepcounter

are realized as global assignments. By contrast, \setlength operates locally (and preceding it by \global is not reliable, since \setlength may be redefined, as the calc package does).
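
The difference between counters and lengths shows up in a minimal document such as the following, where the counter demo and the length \demolen are hypothetical names.

\documentclass{article}
\newcounter{demo}
\newlength{\demolen}
\begin{document}
\begingroup
  \setcounter{demo}{5}% global: survives \endgroup
  \setlength{\demolen}{10pt}% local: undone at \endgroup
\endgroup
\thedemo, \the\demolen % typesets "5, 0.0pt"
\end{document}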