[Tex/LaTex] How to iterate over a comma separated list

Tags: big-list, comma-separated list, loops

How can I iterate, loop, process, or otherwise do something, with each element from a comma separated list of things?

This question has been asked many times, and there are already lots of very good answers everywhere. However, I think it would also be valuable to have a single reference point where all such alternative techniques are explained in detail, with their advantages and disadvantages discussed.

Answers for plain TeX, LaTeX, LaTeX3, LuaTeX, ConTeXt, etoolbox, pgffor, etc. are all invited. Please consider explaining, for example:

How does the command/method deal with whitespace around items?
How does the command/method deal with empty/missing items? (e.g. a,b,,c)
How does the command/method deal with a trailing comma?
Can the command/method work on a list stored in a macro?
Is each iteration evaluated globally or in a local group? (i.e., if a command is defined during an iteration, does the command survive the loop?)
Do items get expanded or somehow mangled before being processed?
What about lists where items are key=value or some other kind of pairs?
Does the method work for lists separated with something other than a comma?
Any other thing I should know before deciding to use this method?

Best Answer

The main loop for comma separated lists in LaTeX3 is

\clist_map_inline:nn

The first argument is an explicit list, the second argument tells LaTeX what to do with each item. For instance, we want to print an enumerate environment from the items:

\documentclass{article}
\usepackage{xparse}
\ExplSyntaxOn

\NewDocumentCommand{\makeenumerate}{ m }
 {
  \begin{enumerate}
  \clist_map_inline:nn { #1 } { \item \fbox{##1} }
  \end{enumerate}
 }

\ExplSyntaxOff

\begin{document}

\makeenumerate{a, b ,c d, ,e }

\end{document}

I used \fbox in order to illustrate some of the features:

spaces are stripped on either side of the items;
empty items are ignored (empty means only spaces between commas);
no expansion is performed on the item.

Note that the current item is denoted by #1 and it's literally available, which is not the case with the usual \@for, where the current item is hidden in a macro. In the code above we have to use double ##1 because we're inside a macro definition.
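
For comparison, here is a minimal sketch of the same kind of loop written with the kernel's \@for. The names \showitems and \xyz@item are made up for this illustration. The current item is reachable only through the macro \xyz@item and, unlike \clist_map_inline:nn, \@for does not trim the spaces around items.

\makeatletter
\newcommand{\showitems}[1]{%
  % \xyz@item (hypothetical name) is redefined to the current item at each iteration
  \@for\xyz@item:=#1\do{\fbox{\xyz@item}}%
}
\makeatother

\showitems{a, b ,c d}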

A similar function is \clist_map_function:nN, which has the advantage of being expandable (but only under x-type full expansion, not f-type). The above example would be

\NewDocumentCommand{\makeenumerate}{ m }
 {
  \begin{enumerate}
  \clist_map_function:nN { #1 } \xyz_make_item:n
  \end{enumerate}
 }

\cs_new_protected:Npn \xyz_make_item:n #1
 {
  \item \fbox { #1 }
 }

In this case the current item is passed as an argument to the indicated function.
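
As a sketch of what expandability allows, the mapping can be placed inside an x-type assignment, provided the mapped function is itself expandable. The helper \xyz_wrap:n is a hypothetical name chosen for this illustration; \l_tmpa_tl is the standard scratch variable.

\ExplSyntaxOn
% defined with \cs_new:Npn (not protected) so that it expands inside an x-type argument
\cs_new:Npn \xyz_wrap:n #1 { [#1] }
\tl_set:Nx \l_tmpa_tl
 { \clist_map_function:nN { a , b , c } \xyz_wrap:n }
% \l_tmpa_tl should now contain [a][b][c]
\ExplSyntaxOff

This would not work with \clist_map_inline:nn, which is not expandable, nor with a mapped function defined by \cs_new_protected:Npn.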

List mappings can be broken; let's say, in the above example, we want to stop processing if an item is \stop:

\documentclass{article}
\usepackage{xparse}
\ExplSyntaxOn

\NewDocumentCommand{\makeenumerate}{ m }
 {
  \begin{enumerate}
  \xyz_make_items:n { #1 }
  \end{enumerate}
 }
\cs_new_protected:Npn \xyz_make_items:n #1
 {
  \clist_map_inline:nn { #1 }
   {
    \tl_if_eq:nnTF { ##1 } { \stop }
     {
      \clist_map_break:
     }
     {
      \item \fbox { ##1 }
     }
   }
 }
\ExplSyntaxOff

\begin{document}

\makeenumerate{a, b ,c d, \stop ,e }

\end{document}


Note that complex code shouldn't be used in \NewDocumentCommand, so I defined an auxiliary function for this purpose.

One can also use

\clist_map_break:n

and the argument given to this function will be executed before breaking the mapping.
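
A minimal sketch of this, reusing the auxiliary function from the example above: when \stop is found, the mapping ends and the code passed to \clist_map_break:n is then inserted. The extra \item text is only an assumption made for the sake of the illustration.

\cs_new_protected:Npn \xyz_make_items:n #1
 {
  \clist_map_inline:nn { #1 }
   {
    \tl_if_eq:nnTF { ##1 } { \stop }
     {
      % end the loop, then typeset a final item (~ is a space in expl3 syntax)
      \clist_map_break:n { \item \emph { (stopped~early) } }
     }
     {
      \item \fbox { ##1 }
     }
   }
 }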

The same features apply when using

\keys_set:nn { <module> } { <comma list of key-value pairs> }

for evaluating a set of key-value pairs: leading and trailing spaces are ignored as are empty (blank) items.
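
As a small illustration, assuming a hypothetical module xyz with two keys (the key names and the variables \l_xyz_width_dim and \l_xyz_label_tl are invented here), the following should set both keys despite the blank item and the trailing comma:

\ExplSyntaxOn
\keys_define:nn { xyz }
 {
  % .dim_set:N and .tl_set:N create the variables if they do not exist yet
  width .dim_set:N = \l_xyz_width_dim ,
  label .tl_set:N  = \l_xyz_label_tl ,
 }
% the blank item and the trailing comma are skipped;
% spaces around keys and values are trimmed
\keys_set:nn { xyz } { width = 2cm , , label = Example , }
\ExplSyntaxOff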

If the comma separated list is stored in a macro, one can use

\clist_map_inline:Nn
\clist_map_function:NN

with the same ideas. In my opinion, it is bad programming style to let a single command accept both kinds of input (an explicit list or a macro holding one); a variant of the command should be defined instead.

\documentclass{article}
\usepackage{xparse}
\ExplSyntaxOn

\NewDocumentCommand{\makeenumerate}{ sm }
 {
  \begin{enumerate}
  \IfBooleanTF{#1}
   {
    \clist_set:NV \l_xyz_input_clist #2
    \xyz_make_items:V \l_xyz_input_clist
   }
   {
    \xyz_make_items:n { #2 }
   }
  \end{enumerate}
 }

\clist_new:N \l_xyz_input_clist
\cs_new_protected:Npn \xyz_make_items:n #1
 {
  \clist_map_inline:nn { #1 }
   {
    \tl_if_eq:nnTF { ##1 } { \stop }
     {
      \clist_map_break:
     }
     {
      \item \fbox { ##1 }
     }
   }
 }
\cs_generate_variant:Nn \xyz_make_items:n { V }
\ExplSyntaxOff

\begin{document}

\newcommand{\mylist}{A, B ,C,}

\makeenumerate{a, b ,c d, \stop ,e }

\makeenumerate*{\mylist}

\end{document}

The contents of the macro holding the comma separated list are first stored in a clist variable because this assignment “normalizes” the list (spaces around items are trimmed and blank items are dropped) before it is handed to the mapping function.
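
The effect of this normalization can be checked with a short sketch that uses the scratch variable \l_tmpa_clist and the same input as \mylist above; the comments state what the two lines are expected to typeset.

\ExplSyntaxOn
\clist_set:Nn \l_tmpa_clist { A, B ,C, }
% number of items after normalization: expected 3
\clist_count:N \l_tmpa_clist
\par
% items joined with a visible separator: expected A|B|C
\clist_use:Nn \l_tmpa_clist { | }
\ExplSyntaxOff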

If other delimiters are desired, the better method is to go to sequences; use \seq_map_inline:Nn or \seq_map_function:NN after splitting the input into components with

\seq_set_split:Nnn \l_xyz_input_seq { ; } { #1 }

Full example:

\documentclass{article}
\usepackage{xparse}
\ExplSyntaxOn

\NewDocumentCommand{\makeenumerate}{ O{,} m }
 {
  \begin{enumerate}
  \xyz_make_items:nn { #1 } { #2 }
  \end{enumerate}
 }

\seq_new:N \l_xyz_input_seq
\cs_new_protected:Npn \xyz_make_items:nn #1 #2
 {
  \seq_set_split:Nnn \l_xyz_input_seq { #1 } { #2 }
  \seq_map_inline:Nn \l_xyz_input_seq
   {
    \tl_if_eq:nnTF { ##1 } { \stop }
     {
      \seq_map_break:
     }
     {
      \item \fbox { ##1 }
     }
   }
 }
\ExplSyntaxOff

\begin{document}

\makeenumerate[;]{a; b ;c, d; \stop ;e }

\end{document}


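Here too the splitting ignores leading and trailing spaces around each item. Empty items, however, are preserved by \seq_set_split:Nnn; if they are unwanted, they can be removed afterwards with \seq_remove_all:Nn \l_xyz_input_seq { }.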
