Parsing formatted arguments with expl3

expl3 macros subscripts superscripts

I want to parse subscripts and superscripts using expl3; previously I used \def for this:

\documentclass{article}

\usepackage{expl3}

\ExplSyntaxOn
\cs_new:Npn \l__parsing_parse_superscript:n ^#1 {superscript = #1}
\cs_new:Npn \l__parsing_parse_subscript:n _#1 {subscript = #1}
\NewDocumentCommand\parsesuperscript{}{\l__parsing_parse_superscript:n}
\NewDocumentCommand\parsesubscript{}{\l__parsing_parse_subscript:n}
\ExplSyntaxOff

\begin{document}
\noindent
\parsesuperscript^10\\
\parsesubscript_100 % TODO error
\end{document}

The superscript gets parsed correctly. For the subscript, however, an error occurs, probably because after \ExplSyntaxOn the underscore is treated as a letter. I also tried \c_underscore_str and \c_math_subscript_token instead of _ in the command definition. How can the underscore be escaped for parsing a subscript? Or are there other solutions for parsing arguments with expl3 that should be preferred?


Best Answer

Your diagnosis is correct: the problem is that in expl3 syntax, _ has catcode 11 (letter). However, using \c_underscore_str is no good either, because the underscore it contains has catcode 12, whereas in the document _ (usually) has catcode 8.

In your definition you have to enforce catcode 8 for the underscore:

\ExplSyntaxOn
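% Spaces are ignored in expl3 syntax, so ~ is used for the visible spaces.
\cs_new:Npn \__michael_parse_superscript:w ^#1 {superscript~=~#1.}
% \use:e expands \char_generate:nn first, so the parameter text
% contains an underscore with catcode 8.
\use:e
  {
    \cs_new:Npn \exp_not:N \__michael_parse_subscript:w
        \char_generate:nn { `\_ } { 8 } #1
      {subscript~=~#1.}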
  }
\NewDocumentCommand\parsesuperscript{}{\__michael_parse_superscript:w}
\NewDocumentCommand\parsesubscript{}{\__michael_parse_subscript:w}
\ExplSyntaxOff

Note also that with \parsesubscript_100 only the 1 is grabbed as the argument; you need \parsesubscript_{100}.
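To make the difference concrete, here is a small document body, assuming the definitions above are in the preamble. The behaviour noted in the comments follows from TeX's rule that an undelimited argument without braces is a single token.

```latex
\begin{document}
\noindent
% Braced: the whole of 100 is passed as #1.
\parsesubscript_{100}\\
% Unbraced: only the 1 is passed as #1; the remaining 00 is
% typeset as ordinary text after the macro's output.
\parsesubscript_100
\end{document}
```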

But that's not really parsing, because if you use the command without a following _ or ^ you'll get an error. You could use the e-type ("embellishment") argument of \NewDocumentCommand instead (note that it works regardless of the order in which the actual arguments appear):

\documentclass{article}
\usepackage{expl3}
\NewDocumentCommand \parsesupsub { e{^_} }
  {
    \IfValueT{#1}{superscript = #1.\\}
    \IfValueT{#2}{subscript = #2.\\}
  }
\parindent=0pt
\parskip=10pt
\begin{document}
\parsesupsub^{10}

\parsesupsub_{100}

\parsesupsub_{100}^{10}

\parsesupsub^{10}_{100}
\end{document}

You could also parse manually with \peek_charcode_remove:NTF _ { <with> } { <without> }, then it would work regardless of the current catcode of _.
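A minimal sketch of that manual approach follows. The names \peeksubscript, \__michael_peek_subscript: and \__michael_print_subscript:n are made up for illustration, and the example assumes a LaTeX kernel recent enough to provide \NewDocumentCommand without xparse, as in the documents above.

```latex
\documentclass{article}
\usepackage{expl3}

\ExplSyntaxOn
% Hypothetical helper: typesets the braced argument that follows.
\cs_new:Npn \__michael_print_subscript:n #1 { subscript~=~#1. }
% Hypothetical helper: looks at the next token without absorbing it.
% The test compares character codes only, so the catcode-11 _ written
% here matches the catcode-8 _ found in the document.
\cs_new_protected:Npn \__michael_peek_subscript:
  {
    \peek_charcode_remove:NTF _
      { \__michael_print_subscript:n } % _ found and removed
      { no~subscript. }                % something else follows
  }
\NewDocumentCommand \peeksubscript { } { \__michael_peek_subscript: }
\ExplSyntaxOff

\begin{document}
\peeksubscript_{100}

\peeksubscript
\end{document}
```

Unlike the delimited-argument version, the second call does not raise an error, because the false branch is taken when no _ follows.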

On naming (I wrote an explanation here), the \l_ (or \g_ or \c_) prefix should be used for variables only. You are defining commands, so they should start with \module_... if public, or \__module_... if private (I used michael as the module name). Also, as Gaussler noted in the comment, the :n argument type should be for "normal" arguments (tokens delimited by {...}). Since you have a "weird" token in the parameter text, you should use :w.
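As an illustration of these conventions, here is a short sketch with hypothetical names, again using michael as the module name:

```latex
\ExplSyntaxOn
% Variable: scope prefix (\l_ local, \g_ global, \c_ constant)
% plus a type suffix such as _tl.
\tl_new:N \l__michael_script_tl
% Public function: \<module>_<name>:<signature>
\cs_new_protected:Npn \michael_store_script:n #1
  { \tl_set:Nn \l__michael_script_tl {#1} }
% Private function: \__<module>_<name>:<signature>
\cs_new:Npn \__michael_show_script:
  { script~=~\tl_use:N \l__michael_script_tl . }
% Function with a delimiter in its parameter text: signature :w
\cs_new:Npn \__michael_parse_superscript:w ^#1 { superscript~=~#1. }
\ExplSyntaxOff
```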
