Dynamic calculation within regular expression quantifier in LaTeX3’s l3regex package

Tags: calculations, l3regex, latex3

I am trying to do some simple calculations to be applied to a regular expression quantifier used in l3regex's \regex_replace_all:nnN function. I've built my code around what I found here: Defining a find and replace algorithm using LaTeX3's l3regex.
I've also consulted l3regex's documentation but I can't really figure this one out.

Here's what I have right now; the new command \redhighlight highlights words of a specific length in red:

\documentclass[a5paper,14pt]{article}
\usepackage[ngerman]{babel}
\usepackage{expl3, l3regex, xparse}
\usepackage{xcolor}

\ExplSyntaxOn
\tl_new:N \l_redhighlight_tl
\NewDocumentCommand \redhighlight { O{1} m } {
    \tl_set:Nn \l_redhighlight_tl { #2 }    
    \regex_replace_all:nnN {
        % note: I am using \"?\w to match German umlauts
        (\"?\b\w)((?:\"?\w){#1})\b
    } {
        \cB\{\c{color}\cB\{red\cE\}\1\2\cE\}
    } \l_redhighlight_tl
    \tl_use:N \l_redhighlight_tl
}
\ExplSyntaxOff


\begin{document}

\redhighlight{Per default, all two-letter words 
                  are highlighted in red.}

\redhighlight[2]{By providing an optional integer 
                     value, one can state the length 
                     of words to be highlighted.}

\end{document}

Within the \regex_replace_all:nnN regular expression definition, i.e. (\"?\b\w)((?:\"?\w){#1})\b, rather than using the optional #1 parameter directly I'd like to do some calculation before using it in the quantifier expression.
I've tried the following:

\ExplSyntaxOn
\tl_new:N \l_redhighlight_tl
\int_new:N \l_optquant_int
\NewDocumentCommand \redhighlight { O{1} m } {
    \tl_set:Nn \l_redhighlight_tl { #2 }    
    \int_set:Nn \l_optquant_int { #1 - 1 }
    \regex_replace_all:nnN {
        % note: I am using \"?\w to match German umlauts
        (\"?\b\w)((?:\"?\w){\l_optquant_int})\b
    } {
        \cB\{\c{color}\cB\{red\cE\}\1\2\cE\}
    } \l_redhighlight_tl
    \tl_use:N \l_redhighlight_tl
}
\ExplSyntaxOff

But this does not seem to work. I'd appreciate any help.

Best Answer

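You can only use literal numbers in the {n} part of a quantifier: the regular expression is read as a string of characters, so \l_optquant_int is never expanded there. In any case, an integer variable on its own does not expand to its digits; that would require \int_use:N.

You have to fully expand the numeric expression to a decimal number, but you also have to make sure the rest of the regular expression is not expanded along with it. The best approach is to do the calculation before passing the argument: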

\documentclass[a5paper]{article}
\usepackage[ngerman]{babel}
\usepackage{expl3, l3regex, xparse}
\usepackage{xcolor}

\ExplSyntaxOn
\tl_new:N \l_flor_redhighlight_tl
\NewDocumentCommand \redhighlight { O{1} m }
 {
  \flor_redhighlight:fn { \int_to_arabic:n { #1 - 1 } } { #2 }
 }

\cs_new_protected:Npn \flor_redhighlight:nn #1 #2
 {
  \tl_set:Nn \l_flor_redhighlight_tl { #2 }    
  \regex_replace_all:nnN
   {
    % note: I am using \"?\w to match German umlauts
    (\"?\b\w)((?:\"?\w){#1})\b
   }
   {
    \c{textcolor}\cB\{red\cE\}\cB\{\1\2\cE\}
   }
   \l_flor_redhighlight_tl

  \tl_use:N \l_flor_redhighlight_tl
 }
\cs_generate_variant:Nn \flor_redhighlight:nn { f }
\ExplSyntaxOff


\begin{document}

\redhighlight{As a default, only one letter words
                  are highlighted in red.}

\redhighlight[2]{By providing an optional integer 
                     value, one can state the length 
                     of words to be highlighted.}

\redhighlight[3]{By providing an optional integer 
                     value, one can state the length 
                     of words to be highlighted. F"ur}

\end{document}


Note that it is preferable for \NewDocumentCommand to pass control to an internal function unless the code is really simple. In this case it's even essential! You can appreciate the power of “generating variants”.
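As a further illustration of variants, the integer variable from the question can be kept if the internal function is called through a V-type variant, which passes the value of a variable rather than the variable itself. This is only a sketch: it assumes the definitions from the code above are already in place, and \l_flor_quant_int is a hypothetical name introduced here.

\ExplSyntaxOn
% hypothetical variable name, not part of the code above
\int_new:N \l_flor_quant_int
% V variant: the first argument is replaced by the value of a variable
\cs_generate_variant:Nn \flor_redhighlight:nn { V }
\RenewDocumentCommand \redhighlight { O{1} m }
 {
  % do the calculation first, as in the question
  \int_set:Nn \l_flor_quant_int { #1 - 1 }
  % the internal function receives explicit digits
  \flor_redhighlight:Vn \l_flor_quant_int { #2 }
 }
\ExplSyntaxOff

The design is the same as before: the regular expression only ever sees explicit digits, because the value is extracted before \flor_redhighlight:nn is called.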

Functions and variables should have a common prefix to avoid conflicts as much as possible. Also, \textcolor{red}{stuff} is preferable to {\color{red}stuff}.

Some explanations

What happens with this code? The main internal function \flor_redhighlight:nn expects, as its first argument, an explicit number to be used in a quantifier. Since the regular expression already matches one letter before the quantified group, the quantifier must be one less than the stated number; that way, passing [2] to \redhighlight really highlights two-letter words and not three-letter ones.

So the argument is passed in the form \int_to_arabic:n { #1 - 1 } to the variant \flor_redhighlight:fn, which essentially does

\flor_redhighlight:nn {<full expansion of \int_to_arabic:n { #1 - 1 }>} { #2 }
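To make this concrete, here is how a call with [3] would unfold. The lines below are an illustration of the intermediate steps, not code to be added to the document:

% \redhighlight[3]{F"ur} first becomes
\flor_redhighlight:fn { \int_to_arabic:n { 3 - 1 } } { F"ur }
% after f-expansion of the first argument this amounts to
\flor_redhighlight:nn { 2 } { F"ur }
% so the regular expression handed to \regex_replace_all:nnN reads
%   (\"?\b\w)((?:\"?\w){2})\b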

One could have defined the variant with x instead of f and the result would have been the same. The difference is that x uses an \edef internally, while f works by pure expansion without resorting to \edef.
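A minimal sketch of the x-type alternative, assuming the definitions from the code above are already in place. \int_eval:n is used here in place of \int_to_arabic:n; it likewise expands to the digits of the result.

\ExplSyntaxOn
% x variant: the first argument is expanded with \edef before the call
\cs_generate_variant:Nn \flor_redhighlight:nn { x }
\RenewDocumentCommand \redhighlight { O{1} m }
 {
  % only the first argument is expanded; #2 is passed on untouched
  \flor_redhighlight:xn { \int_eval:n { #1 - 1 } } { #2 }
 }
\ExplSyntaxOff

Because only the first argument carries the x specifier, the text to be highlighted is never expanded prematurely.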
