There are roughly two ways to patch a command: via \scantokens, and via expansion+redefinition; there's a (not so) brief explanation of both at the end of this answer. When ltcmdhooks can detect the type of a command, so that it knows exactly the <parameter text> of the command, it patches by expansion+redefinition, which imposes no restriction on the catcode settings in force when the macro was defined. \appendix takes no arguments, so it can be treated as a token list: expanded, then redefined with the added material.
For example, here's a simple sketch of how it works:
\def\appendix{%
\typeout{This starts the appendix.}}
% \append<cmd>{<material>} expands <cmd> once, then redefines it with
% <material> appended to its replacement text:
\def\append#1{%
\expandafter\appendaux\expandafter{#1}#1}
% #1 = old replacement text, #2 = the command, #3 = the added material
\def\appendaux#1#2#3{%
\def#2{#1#3}}
\append\appendix{\typeout{I added this.}}
\appendix
However, what I did not anticipate when I wrote that code is the case where the original definition of \appendix contains ## (try this definition in the code above):
\def\appendix{%
\typeout{This starts the appendix. ##BOOM!}}
When \appendix is defined like that, TeX's definition scanner sees the two consecutive parameter characters ## and replaces them by a single parameter token # in the definition of \appendix; so far so good. However, when you expand the command, TeX also returns that single #, so when you then try to redefine the command you have:
\def\appendix{%
\typeout{This starts the appendix. #BOOM!}%
\typeout{I added this.}}
which contains an illegal parameter (#B), and the definition errors.
I have changed ltcmdhooks to handle this case (there's a brief explanation below), but meanwhile you can use \ActivateGenericHook (or \ProvideHook in LaTeX 2021-06-01) to tell ltcmdhooks that you have already patched the command, so it won't try patching; you then do the patching manually using etoolbox:
\documentclass{book}
\usepackage{cleveref}
\usepackage{etoolbox}
\IfFormatAtLeastTF{2021-11-15}%
{\ActivateGenericHook}% LaTeX >= 2021-11-15
{\ProvideHook}% LaTeX = 2021-06-01
{cmd/appendix/before}
\pretocmd\appendix
{\UseHook{cmd/appendix/before}}
{}{\FAILED}
\AddToHook{cmd/appendix/before}{\label{appendix}}
\begin{document}
Hello world!
\appendix
\end{document}
Why the above works
The interface for ltcmdhooks in \AddToHook is supposed to work as follows:
- If an end user writes \AddToHook{cmd/name/before}{code}, and the hook cmd/name/before doesn't exist yet (which implies that the command \name doesn't have that hook "installed"), then the code tries to patch that hook into the command.
- If the end user writes \AddToHook{cmd/name/before}{code}, and the hook cmd/name/before already exists, this (probably) means that the command \name already has that hook, so the code is just added to the hook and the command is left alone.
This means that a package author may want to fine-tune the position of the cmd/name/before hook (for example, \def\name{<some initialization>\UseHook{cmd/name/before}<definition>}); in that case we don't want ltcmdhooks patching the command again (it would be wrong to add the same hook twice), so we tell ltcmdhooks that the hook already exists by saying \ActivateGenericHook{cmd/name/before}, and patching is no longer attempted.
This works for your case because you manually add the hook to the command, and then tell ltcmdhooks that patching is no longer needed. See section 3, Package Author Interface, of the ltcmdhooks documentation.
So in essence you, as the package author, are appropriating the \appendix command by adding the hook yourself (exactly where ltcmdhooks would add it), and then telling ltcmdhooks not to patch it by using \ActivateGenericHook.
If instead of \appendix you were adding hooks to \UniqueCommandFromMyPackage, then you could use \NewHook instead of \ActivateGenericHook (the effect would be identical), because there would be no possibility of a name conflict.
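As a sketch of that fine-tuning scenario (the command and hook names here are hypothetical, and the \typeout lines are stand-ins for real code), a package author could write:
% Install the generic hook by hand at a chosen spot, then declare it so
% ltcmdhooks never tries to patch \mycmd itself:
\ActivateGenericHook{cmd/mycmd/before}% (\ProvideHook in LaTeX 2021-06-01)
\def\mycmd{%
\typeout{some initialization}% runs before the hook
\UseHook{cmd/mycmd/before}%
\typeout{actual definition}%
}
After this, an end user's \AddToHook{cmd/mycmd/before}{code} just fills the existing hook instead of triggering a patch.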
How LaTeX2ε handles this case now
The problem: it turns out that in the described case we're at a dead end. When you write a definition like
\def\foo#1{#1##X}
TeX stores its <replacement text> as a token list containing:
out_param 1, par_token #, letter X
(out_param 1 is the #1 to be replaced by the actual parameter when the macro is expanded, par_token # is a catcode-6 #, and letter X is a catcode-11 X).
Then, when you expand \foo with the argument #1 (par_token #, character 1), TeX replaces out_param 1 and you have:
par_token #, character 1, par_token #, letter X
which is equivalent to typing #1#X. If you plug that back into a new definition of \foo you'll have:
\def\foo#1{#1#X}
which is obviously wrong (hence the Illegal parameter number error). And at this point you have no way to tell what was an actual parameter when the macro was defined and what was a single parameter token.
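You can see this dead end with a small experiment (using Y as the argument, since a literal # can't be typed at top level; \bar is just a scratch name):
\def\foo#1{#1##X}
% Expand \foo once and try to use the result as a new definition.
% This fails with "! Illegal parameter number in definition of \bar."
% because \foo{Y} expands to Y#X, where # is a lone parameter token:
\expandafter\def\expandafter\bar\expandafter{\foo{Y}}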
Half solution: there is one very simple case that can be easily detected and solved (which coincidentally is the one in your question): a macro without parameters. In this case the macro takes no arguments, so any loose ## in its definition cannot possibly be confused with a parameter, and we can treat such macros as token lists (in the expl3 sense), do something akin to \tl_put_right:Nn, and the problem is solved.
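In expl3 terms, that parameterless case looks roughly like this (a sketch of the idea, not the actual kernel code):
\ExplSyntaxOn
% A parameterless macro is, for all practical purposes, a token list,
% so appending material preserves any ## in its body:
\cs_set:Npn \appendix { \typeout { This~starts~the~appendix.~##BOOM! } }
\tl_put_right:Nn \appendix { \typeout { I~added~this. } }
\ExplSyntaxOff
The tl functions append without ever rescanning the parameter token, so no Illegal parameter number error can occur.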
Another relatively simple case is when the macro has no ## in its definition. In that case we don't have to worry about confusing parameters, so we treat the macro normally (this was the case implemented initially). LaTeX uses a rather simple loop (\__hook_if_has_hash:nTF) to check whether a macro has a parameter token in its definition: it looks at every token in the definition and compares it with #.
The other half: when the macro falls into the general case of having both parameters and parameter tokens in its definition (like \foo above), we have to manually re-double every parameter token in the definition so that it can be re-made. To do that, instead of expanding \foo with #1, LaTeX expands it with \c_@@_hash_tl 1, so \foo{\c_@@_hash_tl 1} leads to a definition like:
\foo#1{\c_@@_hash_tl 1#X}
then we loop through the replacement text of the macro (inside the braces), double every # into ##, and replace every \c_@@_hash_tl by a single #, which then gives:
\foo#1{#1##X}
and then we can do the definition normally (phew!).
Patching with \scantokens
(wordier description here)
Suppose a macro is defined with
\long\def\mycmd[#1]#2{\typeout{#1//#2}}
To append some code to it via \scantokens, you first do \meaning\mycmd to get a string like:
\long macro:[#1]#2->\typeout {#1//#2}
(with the usual \detokenize catcodes: everything catcode 12 except spaces, which are catcode 10), then you use a delimited macro to separate the <prefixes>, the <parameter text>, and the <replacement text>, roughly like this:
\def\split#1{\expandafter\splitaux\meaning#1\relax}
\expanded{%
\noexpand\def\noexpand\splitaux#1\detokenize{macro:}#2->#3\relax}{%
\def\prefixes{#1}%
\def\parameter{#2}%
\def\replacement{#3}}
(I'm using \def\prefixes{#1}, etc., for the sake of understandability, but in reality you would inject everything expandably instead; see the definition of \__kernel_prefix_arg_replacement:wN in expl3-code.tex, and \etb@patchcmd in etoolbox.sty if you're feeling brave.)
At this point you have each part of the definition as a separate string. Now you can append or prepend some code to \replacement (or replace some part of it, as is done in \patchcmd), or in rarer cases change \prefixes or \parameter. To reconstruct the definition you need:
<prefixes>\def\mycmd<parameter text>{<replacement text>}
but the three parts you have are still catcode-12 tokens, which are no good. Here comes the \scantokens part: you rescan those strings back into "normal" tokens:
\expanded{%
\noexpand\scantokens{%
% <prefixes>\def \mycmd<parameter text>{<replacement text>}
\prefixes \def\noexpand\mycmd\parameter {\replacement <added material>}%
}%
}
which, after \expanded does its job, becomes:
\scantokens{%
\long\def\mycmd[#1]#2{\typeout {#1//#2}<added material>}%
}%
then \scantokens does its thing and turns everything back into tokens using the current catcode settings, after which the definition is carried out normally.
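This detokenize-and-rescan strategy is essentially what etoolbox's \patchcmd and \apptocmd implement for you, so in practice the append could be written as (a sketch; the \typeout messages are placeholders):
\usepackage{etoolbox}
\long\def\mycmd[#1]#2{\typeout{#1//#2}}
% Append material via etoolbox's \meaning/\scantokens machinery;
% the last two arguments run on success/failure of the patch:
\apptocmd\mycmd{\typeout{added material}}
{\typeout{patched OK}}{\typeout{patching failed}}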
The advantage of this method is that you can do virtually any manipulation on any part of the definition.
The disadvantages are a few:
- You need to know which catcodes were in force when the definition was first made (when patching you usually need to verify that a simple round of \meaning–\scantokens doesn't change the meaning of the macro), otherwise you can't patch safely;
- If the macro was created with some combination of \edef and \detokenize to forcibly make some catcode-12 tokens, you will probably not be able to patch it (for example, \splitaux as defined above in this answer can never be patched with \patchcmd because it contains letters (for example, m) of both catcodes 11 and 12);
- If the <parameter text> of the macro contains the characters ->, you won't be able to patch the macro.
Patching with expansion+redefinition
This method is much simpler, but requires prior knowledge of how the macro was defined. That is possible in a few cases, namely when you know exactly what the <parameter text> of the macro is. The cases known by the kernel are when the macro was defined with \DeclareRobustCommand, or with ltcmd (\NewDocumentCommand or \NewExpandableDocumentCommand), or with \newcommand with an optional argument, or when the macro takes no arguments.
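For those known cases no manual patching is needed at all; for example, with an ltcmd-defined command (the name \mycmd here is hypothetical) you can hook directly:
\NewDocumentCommand\mycmd{O{default}m}{\typeout{#1//#2}}
% ltcmdhooks knows the <parameter text> of ltcmd commands,
% so this patches safely, with no catcode restrictions:
\AddToHook{cmd/mycmd/before}{\typeout{before \string\mycmd}}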
Suppose the same macro as before, but defined with:
\newcommand\mycmd[2][default]{\typeout{#1//#2}}
(it will have an internal macro called \\mycmd, but for the sake of simplicity let's call it \mycmd as well); then we know for sure that its <parameter text> is [#1]#2. Knowing what arguments the macro expects, we can feed it #1, #2, ... as arguments, so for \mycmd we would do:
\mycmd[#1]{#2}
which would then expand to the <replacement text> of the macro, with the first parameter replaced by #1 (the parameter token #, catcode 6, followed by the character 1, catcode 12). The patching scheme would be something like:
\expanded{%
\def\noexpand\mycmd[#1]#2{%
\unexpanded\expandafter{\mycmd[#1]{#2}<added material>}%
}%
}
then after the \expanded is done you are left with:
\def\mycmd[#1]#2{\typeout{#1//#2}<added material>}
which is exactly what you had with the \scantokens approach, except that you never turned the tokens into a string, so catcodes don't matter at all here.
The advantages of this method are roughly the disadvantages of the \scantokens method:
- catcodes don't matter at all;
- you can patch complicated macros (including the \splitaux macro from before) using this method, given that you know exactly what its <parameter text> is;
- the <parameter text> of the macro may contain any token your heart desires (as long as you know what token it is); and
- this method doesn't need a sanity check to ensure that the macro can be patched correctly.
The disadvantage is the requirement for the method to work: you need to know exactly what the <parameter text> is.
Best Answer
Here are 5 different options:
- e-type expansion will fail if postheadhook contains anything fragile.
- o-type expansion doesn't work because it expands exactly once, and \exp_args:NnV takes way more expansions than that to work.
- f-type expansion is inappropriate 99% of the time, but it seems fine here. I'd slightly prefer it, just in case \AddToHookNext were to blow up when expanded, but it (and most other LaTeX commands) are protected, so this doesn't really make a difference in this case.
(What would be best would be something like \hook_gput_code:nne … \hook_gput_next_code:ne … \exp_not:V \l_mbert_thm_postheadhook_tl, but that throws an error for some unknown reason.)