Use str_foldcase in the preamble without generating a missing begin document error


Edit

Thanks to @PhelypeOleinik for pointing out the problem wasn't the \str_foldcase itself but the switch statement bodies trying to typeset text before the document has begun. The cognitive overload from trying to learn expl3 was so high that I lacked capacity to spot such a basic error.

New MWE

This works but doesn't print the value of the selected language, nor does it actually select a language. I'm not entirely sure the naming of the variable \l__myclass_lang_tl is "best practice" as this is my first time. Advice, corrections, and improvements on this would be welcome.

As I understand it so far, though I'm likely wrong or incomplete in multiple places:

`\l` means declare a local variable
`_` (the first one) is just a separator
`_` (the second one) is currently a mystery to me
`myclass` is referencing the name space of the class I'm creating
`_` is just a separator for readability because we're using snake_case
`lang` is the meaningful part of variable name
`_` another separator
`tl` I don't fully understand yet: I know it stands for "token list"
     but I'm not sure why this has to be a token list.
     I was expecting "str" for "string"
\documentclass[]{article}

\usepackage{xparse}
\ExplSyntaxOn
\tl_new:N \l__myclass_lang_tl

\NewDocumentCommand \selectLang { m }
  {
    \str_case_e:nnF { \str_foldcase:e { #1 } }
      {
        { english } { \tl_set:Nn \l__myclass_lang_tl {english} }
        { norsk }   { \tl_set:Nn \l__myclass_lang_tl {norsk} }
        { nynorsk } { \tl_set:Nn \l__myclass_lang_tl {nynorsk} }
        { samisk }  { \tl_set:Nn \l__myclass_lang_tl {samisk} }
        { samin }   { \tl_set:Nn \l__myclass_lang_tl {samin} }
      }
      {\tl_set:Nn \l__myclass_lang_tl {error}}
  }
\cs_generate_variant:Nn \str_foldcase:n { e }
\ExplSyntaxOff
\selectLang{Samin}
\renewcommand\selectlanguage{} % just to show that it works

\begin{document}
Chosen language: \selectLang{EnGlIsH}

\def\foo{NOrsK}
Chosen language: \selectLang{\foo}
\end{document}

Original (for posterity)

I believe I'm having trouble with \str_foldcase needing an active char but I'm not certain.

I'm writing my first class file and want to utilise expl3. I've only just started with expl3 and I'm still at the 'confused beginner' stage. I'm feeling very frustrated that I haven't been able to work this out for myself.

For class options which can take parameters, eg \documentclass[fruit=banana]{fruity} (with other valid options being apple and cherry) I want to process these in a case-insensitive way and found this promising-looking example which uses a case-insensitive switch statement to create a \selectLang command.

However, when I call the resulting \selectLang command while still in the preamble (line 18 of the MWE below), it causes a “Missing \begin{document}” error.

I've done what I can to research this and it seems to be something about active char which I don't currently understand. I have seen this but I don't understand the answer and since it dates from 2015 and talks about a new mechanism, it seems out of date. This seems more modern but I still don't understand.

I have looked at expl3.pdf, xparse.pdf, and interface3.pdf, but I'm too new to expl3 to really understand them. As far as I can tell, none of them has an example showing how to use any variant of \str_foldcase so that it works in the preamble.

MWE

This is my modified version of the example from the original question. I know it's not a class; I'm just trying to make case-insensitive string comparisons work in the preamble. I've made only two changes to the original example. First, I removed the \selectlanguage macro call, because it generates an understandable error: you can't select a language at that stage. Second, I added line 18, which calls \selectLang while still in the preamble.

I get the same error whether the \selectLang call in the preamble is before or after \ExplSyntaxOff

Removing \str_foldcase:e and its braces makes the “Missing \begin{document}” error go away, which is what makes me think the problem is \str_foldcase being run inappropriately. It works fine when run from the document body.

\documentclass[]{article}

\usepackage{xparse}
\ExplSyntaxOn
\NewDocumentCommand \selectLang { m }
  {
    \str_case_e:nn { \str_foldcase:e { #1 } }
      {
        { english } { selectlanguage~british }
        { norsk }   { selectlanguage~norsk }
        { nynorsk } { selectlanguage~nynorsk }
        { samisk }  { selectlanguage~samisk }
        { samin }   { selectlanguage~samin }
      }
  }
\cs_generate_variant:Nn \str_foldcase:n { e }
\ExplSyntaxOff
\selectLang{Samin}  % Causes an error about a missing begin document
\renewcommand\selectlanguage{} % just to show that it works

\begin{document}
Chosen language: \selectLang{EnGlIsH}

\def\foo{NOrsK}
Chosen language: \selectLang{\foo}
\end{document}

generates

! LaTeX Error: Missing \begin{document}.

See the LaTeX manual or LaTeX Companion for explanation.
Type  H <return>  for immediate help.
 ...                                              
                                                  
l.18 \selectLang{Samin}
                       
You're in trouble here.  Try typing  <return>  to proceed.

Best Answer

Adding the answer to the original question for reference: your usage of \str_case_e:nn tries to typeset the text selectlanguage <name>, and nothing can be typeset before \begin{document}, so you can't use it in the preamble or in a class file (which is read before \begin{document}). Now to the edited question.

Your \selectLang command is just “parsing” the user input and setting the token list \l__myclass_lang_tl to a value that depends on it. \selectLang does nothing else (in particular, it doesn't print anything), which is exactly what you are seeing. If you want \selectLang to typeset something when you use it, you have to explicitly tell it to:

\NewDocumentCommand \selectLang { m }
  {
    \str_case_e:nnF { \str_foldcase:e { #1 } }
      { ... cases ... }
      { ... }
    Selected~language~is~\tl_use:N \l__myclass_lang_tl % <--- This line “uses” the token list
  }

then, when you use it, it typesets the text Selected language is <value of \l__myclass_lang_tl>. However, as a matter of good programming practice, it is advisable to separate the two jobs: one command parses the user input and sets \l__myclass_lang_tl, and another prints the text when you want it to, so something like:

\NewDocumentCommand \selectLang { m }
  {
    \str_case_e:nnF { \str_foldcase:e { #1 } }
      { ... cases ... }
      { ... }
  }
\NewDocumentCommand \printLang { }
  { Selected~language~is~\tl_use:N \l__myclass_lang_tl }

then setting and printing happen independently.
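For reference, here is a minimal sketch of the two-command approach as a complete document. The case list is shortened to two languages for brevity, and \printLang is the illustrative name used above, not an existing command. It assumes an expl3 release in which \str_foldcase:n is still available (I believe newer releases call it \str_casefold:n, so adjust the name if your installation complains).

```latex
\documentclass{article}
\usepackage{xparse} % only needed on LaTeX kernels older than 2020-10-01

\ExplSyntaxOn
\cs_generate_variant:Nn \str_foldcase:n { e }
\tl_new:N \l__myclass_lang_tl

% Parsing only: stores a value and typesets nothing,
% so it is safe to call in the preamble.
\NewDocumentCommand \selectLang { m }
  {
    \str_case_e:nnF { \str_foldcase:e {#1} }
      {
        { english } { \tl_set:Nn \l__myclass_lang_tl { english } }
        { norsk }   { \tl_set:Nn \l__myclass_lang_tl { norsk } }
      }
      { \tl_set:Nn \l__myclass_lang_tl { error } }
  }

% Printing only: typesets text, so use it after \begin{document}.
\NewDocumentCommand \printLang { }
  { Selected~language~is~\tl_use:N \l__myclass_lang_tl }
\ExplSyntaxOff

\selectLang{NoRsK} % fine in the preamble: nothing is typeset

\begin{document}
\printLang % typesets "Selected language is norsk"
\end{document}
```

Because \selectLang only performs assignments, the preamble call no longer triggers the “Missing \begin{document}” error.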


On variable and function naming in expl3

In expl3 you can have both public and private functions and variables. Public ones should be documented (preferably stable), and can be used by others, while private ones shouldn't have public documentation, and you are “allowed”[1] to change or remove them at will, as long as the public interfaces aren't affected.

The naming scheme allows you to quickly identify what is a function and what is a variable, and easily distinguish public from private ones. Functions start with \module_... when public, or \__module... when private. The overall rule is:

\(__)<module>_<name>:<args>
% for example:
% \myclass_print_authors: or \__myclass_parse_list:nn

where the __ indicates when it's a private function, <module> is the prefix used for your class, package, or expl3 module, <name> is a descriptive name for the function, formed by one or more parts separated by _, and <args> is the function's (possibly empty) argument signature.

Variables can be distinguished from functions because they all start with the <scope>:

\<scope>_(_)<module>_<name>_<type>
% for example:
% \g_myclass_title_tl or \l__myclass_author_count_int

where the <scope> is l (local), g (global), or c (constant), the extra _ indicates a private variable, <module> and <name> are as explained above for functions, and <type> is the type of variable (tl, int, str, fp, etc.).
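As a rough illustration of the scheme, the declarations below reuse the variable names from the examples above. The constant \c__myclass_default_lang_str and the two functions are hypothetical names invented for this sketch, and myclass is assumed to be your module prefix.

```latex
\ExplSyntaxOn
% Public variable: global scope, token list type.
\tl_new:N  \g_myclass_title_tl
% Private variables: note the double underscore after the scope letter.
\int_new:N \l__myclass_author_count_int                % local integer
\str_const:Nn \c__myclass_default_lang_str { english } % constant string

% Private function (hypothetical): one n-type argument, hence the :n signature.
\cs_new_protected:Npn \__myclass_store_title:n #1
  { \tl_gset:Nn \g_myclass_title_tl {#1} }

% Public function (hypothetical) with an empty argument signature.
\cs_new:Npn \myclass_print_title:
  { \tl_use:N \g_myclass_title_tl }
\ExplSyntaxOff
```

The setter is declared with \cs_new_protected:Npn because it performs an assignment and therefore cannot work in an expansion-only context.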

[1] We're talking about TeX, so in practice you can redefine anything as long as you're willing to accept the risk, but the well-known issues about “this bug can't be fixed because another package relies on the wrong behaviour” made it necessary to have an agreement on what macros can be relied on, and which ones are subject to change without warning.

On tl vs. str

TeX doesn't have the concept of a “string” as found in most programming languages. Almost everything in TeX ends up being a token, so you often need to store such tokens, and that's why you use token lists. With the help of ε-TeX's \detokenize you can coerce all the tokens in a list to have catcode 12 (other)[2], which strips them of any special meaning. Such lists behave more or less like strings, so they receive the special name “string” (but they are nonetheless token lists).

[2] Except spaces, which are catcode 10 (space).
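The difference can be seen in a small sketch like the following. The variable names and the \showDemo command are made up for this example, and T1 encoding is loaded only so that the backslash and braces print as themselves.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{xparse} % only needed on LaTeX kernels older than 2020-10-01

\ExplSyntaxOn
\tl_new:N  \l__myclass_demo_tl
\str_new:N \l__myclass_demo_str

% The token list keeps \textbf as a command token.
\tl_set:Nn  \l__myclass_demo_tl  { \textbf{norsk} }
% The string stores the same input detokenized: only characters remain.
\str_set:Nn \l__myclass_demo_str { \textbf{norsk} }

\NewDocumentCommand \showDemo { }
  {
    \tl_use:N \l__myclass_demo_tl   % \textbf is executed: bold text
    \par
    \str_use:N \l__myclass_demo_str % the characters are typeset literally
  }
\ExplSyntaxOff

\begin{document}
\showDemo
\end{document}
```

For a value such as a language name, which consists only of letters, either type works. A tl is the more general container, while a str guarantees that nothing stored in it can be executed.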

Finally

\str_case_e:nn (and other \<thing>_case:nn(TF) functions) are expandable (marked with ☆ or ★ in interface3), so you can use that to make your code less repetitive:

\NewDocumentCommand \selectLang { m }
  {
    \tl_set:Nx \l__myclass_lang_tl
      {
        \str_case_e:nnF { \str_foldcase:e {#1} }
          {
            { english } { english }
            { norsk }   { norsk }
            { nynorsk } { nynorsk }
            { samisk }  { samisk }
            { samin }   { samin }
          }
          { error }
      }
  }

By using \tl_set:Nx instead of \tl_set:Nn you force an exhaustive expansion of the token list before actually assigning it to \l__myclass_lang_tl, and since the functions \str_case_e:nnF and \str_foldcase:e are both expandable, both will work when x-expanded, so in the end the token list will either contain one of the listed languages or error.
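A possible way to try this out is sketched below, with the case list shortened and a hypothetical \printLang command added only to display the result. It again assumes that \str_foldcase:n is available in your expl3 release.

```latex
\documentclass{article}
\usepackage{xparse} % only needed on LaTeX kernels older than 2020-10-01

\ExplSyntaxOn
\cs_generate_variant:Nn \str_foldcase:n { e }
\tl_new:N \l__myclass_lang_tl

\NewDocumentCommand \selectLang { m }
  {
    % x-expansion runs the case switch first, then stores its result.
    \tl_set:Nx \l__myclass_lang_tl
      {
        \str_case_e:nnF { \str_foldcase:e {#1} }
          {
            { english } { english }
            { norsk }   { norsk }
          }
          { error }
      }
  }

% Hypothetical helper, used here only to show the stored value.
\NewDocumentCommand \printLang { }
  { \tl_use:N \l__myclass_lang_tl }
\ExplSyntaxOff

\def\foo{NOrsK}
\selectLang{\foo} % preamble use: \foo is expanded, then case-folded

\begin{document}
\printLang % typesets "norsk"

\selectLang{Klingon}\printLang % no case matches, so this typesets "error"
\end{document}
```

Since the switch now only produces text inside an assignment, adding a language means adding one short line to the case list.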
