[TeX/LaTeX] Looping over strings

loops macros strings

Having fun with strings, I found three different macros to loop over a string character by character.
However, I am not sure exactly how they work.
Can somebody explain the mechanism of each of them?
Which is the most "correct" one?
Is there some other way to construct this macro?

I have changed the original ones so that they all look as similar as possible.

Macro based on one by Tarass:

This is the easiest one for me to understand.
Since the arguments are delimited, the first one is the first letter and the rest is reintroduced into the loop.
The looping works from within the \if statement.

Macro based on one by David Carlisle:

This one is trickier.
The looping comes from inside the \if statement, but not really.
If the \expandafter is eliminated, it does not work.
If the \xloop is placed after the \fi, it does not work either.
So what is happening exactly?

The rest is easier: every time \xloop is executed, it finds the remaining part of the string and takes the first character.

Macro based on one by Florent:

This one is even more complicated (for me, at least).
The looping works from outside the \if statement, but the characters are accessed one by one as before.
Surprisingly, here the spaces are not discarded.
This can also be achieved in the previous macros with \obeyspaces.

\documentclass{article}

\begin{document}

\subsection*{Macro based on one by Tarass:}

\def\xloop<#1#2>{%
  \ifx\relax#1
    \else
      (#1)\xloop<#2>%
  \fi}  
\def\markletters#1{\xloop<#1\relax>}

\markletters{Hello World!}

\subsection*{Macro based on one by David Carlisle:}

\def\xloop#1{%
  \ifx\relax#1
    \else
      (#1)\expandafter\xloop%
  \fi}
\def\markletters#1{\xloop#1\relax}%

\markletters{Hello World!}


\subsection*{Macro based on one by Florent:}

\def\gobblechar{\let\xchar= }
\def\assignthencheck{\afterassignment\xloop\gobblechar}
\def\xloop{%
  \ifx\relax\xchar
      \let\next=\relax
    \else
      (\xchar)\let\next=\assignthencheck
  \fi
  \next}
\def\markletters#1{\assignthencheck#1\relax}

\markletters{Hello World!}


\end{document}


Best Answer

If I modify your test a bit to make a shorter argument for tracing

\documentclass{article}

\def\test#1{{
\tracingonline=1
\tracingmacros=1
\markletters{#1}
}
\typeout{TYPEOUT: \markletters{#1}}
}
\begin{document}

\subsection*{Macro based on one by Tarass:}

\def\xloop<#1#2>{%
  \ifx\relax#1
    \else
      (#1)\xloop<#2>%
  \fi}  
\def\markletters#1{\xloop<#1\relax>}

\test{a bc}

\subsection*{Macro based on one by David Carlisle:}

\def\xloop#1{%
  \ifx\relax#1
    \else
      (#1)\expandafter\xloop%
  \fi}
\def\markletters#1{\xloop#1\relax}%

\test{a bc}


\subsection*{Macro based on one by Florent:}

\def\gobblechar{\let\xchar= }
\def\assignthencheck{\afterassignment\xloop\gobblechar}
\def\xloop{%
  \ifx\relax\xchar
      \let\next=\relax
    \else
      (\xchar)\let\next=\assignthencheck
  \fi
  \next}
\def\markletters#1{\assignthencheck#1\relax}

\test{a bc}


\end{document}

then the first test produces

\markletters #1->\xloop <#1\relax >
#1<-a bc

\xloop <#1#2>->\ifx \relax #1 \else (#1)\xloop <#2>\fi 
#1<-a
#2<- bc\relax 

\@nobreakfalse ->\global \let \if@nobreak \iffalse 

\xloop <#1#2>->\ifx \relax #1 \else (#1)\xloop <#2>\fi 
#1<-b
#2<-c\relax 

\xloop <#1#2>->\ifx \relax #1 \else (#1)\xloop <#2>\fi 
#1<-c
#2<-\relax 

\xloop <#1#2>->\ifx \relax #1 \else (#1)\xloop <#2>\fi 
#1<-\relax 
#2<-
TYPEOUT: (a)(b)(c) 

Here you see that the macro uses a delimited argument, so the entire list is grabbed each time (all tokens up to >). The first token is handled, and the remaining tokens are re-inserted in the recursive call.

  1. As #1 is a normal undelimited argument, it always drops spaces.
  2. As the whole thing works by expansion, it works in expansion-only contexts such as \write, so you get TYPEOUT: (a)(b)(c)
  3. At each stage a \fi is left after the recursive call, so if you have 1000 entries there will be 1000 of these, and at some point you will overflow the input stack.
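The \fi build-up in point 3 can be avoided while keeping the delimited argument. The following is only a sketch of one common idiom, not part of the original macros: the helper names \usearg and \droparg are made up here (LaTeX's internal \@firstofone and \@gobble do the same job), and the recursive call is moved into a braced group placed after the \fi.

```latex
\documentclass{article}
\begin{document}

% Hypothetical helpers: keep or discard one braced argument.
\long\def\usearg#1{#1}
\long\def\droparg#1{}

\def\xloop<#1#2>{%
  \ifx\relax#1%
    \expandafter\droparg % end marker reached: throw the continuation away
  \else
    \expandafter\usearg  % \fi is expanded away first, then the continuation runs
  \fi
  {(#1)\xloop<#2>}}
\def\markletters#1{\xloop<#1\relax>}

\markletters{a bc}
\typeout{TYPEOUT: \markletters{a bc}}

\end{document}
```

In both branches the \expandafter closes the conditional before the braced group is used, so no \fi should be left pending behind the recursive call. The loop still works purely by expansion, and it still drops spaces because #1 remains an undelimited argument.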

The second block produces

\markletters #1->\xloop #1\relax 
#1<-a bc

\xloop #1->\ifx \relax #1 \else (#1)\expandafter \xloop \fi 
#1<-a

\@nobreakfalse ->\global \let \if@nobreak \iffalse 

\xloop #1->\ifx \relax #1 \else (#1)\expandafter \xloop \fi 
#1<-b

\xloop #1->\ifx \relax #1 \else (#1)\expandafter \xloop \fi 
#1<-c

\xloop #1->\ifx \relax #1 \else (#1)\expandafter \xloop \fi 
#1<-\relax 
TYPEOUT: (a)(b)(c) 

Here you can see that, after the first macro, the inner macro does not grab the whole list but just the first token. Doing it this way avoids reparsing the list and overloading the input stack, but you need to expand the \fi (to nothing) before making the recursive call, as you cannot put the \fi after the list as in the first version. Hence the \expandafter, which forces \fi to expand before \xloop.
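To make the role of the \expandafter concrete, here is the same macro with a hand-written expansion walk-through in the comments, for the shorter input ab. The walk-through is my own annotation rather than tracing output, so treat it as a sketch.

```latex
\documentclass{article}
\begin{document}

\def\xloop#1{%
  \ifx\relax#1
    \else
      (#1)\expandafter\xloop%
  \fi}
\def\markletters#1{\xloop#1\relax}

% What TeX sees for \markletters{ab}, one step per line:
%   \xloop ab\relax
%   \ifx\relax a \else (a)\expandafter\xloop\fi b\relax  % \xloop took "a" as #1
%   (a)\expandafter\xloop\fi b\relax                     % test is false: skip to \else
%   (a)\xloop b\relax                                    % \fi expanded before \xloop
%
% Without \expandafter, \xloop would take the \fi as its argument
% instead of "b", so the loop would never advance through the string.
% With \xloop placed after the \fi, it would also be called once the
% \relax end marker has been found, and would go on to grab whatever
% follows the string.

\markletters{ab}

\end{document}
```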

The third version produces

\markletters #1->\assignthencheck #1\relax 
#1<-a bc

\assignthencheck ->\afterassignment \xloop \gobblechar 

\gobblechar ->\let \xchar = 

\xloop ->\ifx \relax \xchar \let \next =\relax \else (\xchar )\let \next =\assi
gnthencheck \fi \next 

\@nobreakfalse ->\global \let \if@nobreak \iffalse 

\next ->\afterassignment \xloop \gobblechar 

\gobblechar ->\let \xchar = 

\xloop ->\ifx \relax \xchar \let \next =\relax \else (\xchar )\let \next =\assi
gnthencheck \fi \next 

\next ->\afterassignment \xloop \gobblechar 

\gobblechar ->\let \xchar = 

\xloop ->\ifx \relax \xchar \let \next =\relax \else (\xchar )\let \next =\assi
gnthencheck \fi \next 

\next ->\afterassignment \xloop \gobblechar 

\gobblechar ->\let \xchar = 

\xloop ->\ifx \relax \xchar \let \next =\relax \else (\xchar )\let \next =\assi
gnthencheck \fi \next 

\next ->\afterassignment \xloop \gobblechar 

\gobblechar ->\let \xchar = 

\xloop ->\ifx \relax \xchar \let \next =\relax \else (\xchar )\let \next =\assi
gnthencheck \fi \next 
! Undefined control sequence.

Here the item is grabbed by a \let assignment. This has the advantage of seeing space tokens, but as it does not work by expansion it fails in the \typeout.
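The reason \let sees spaces is that its syntax allows an optional = followed by one optional space, and \gobblechar already supplies both, so whatever token comes next is assigned, even a space. A small check, assuming a hypothetical helper \showchar that is not part of the original code:

```latex
\documentclass{article}
\begin{document}

\def\gobblechar{\let\xchar= }

% Hypothetical helper: assign the token in #1 to \xchar,
% then write the meaning of \xchar to the log.
\def\showchar#1{\gobblechar#1\typeout{xchar is: \meaning\xchar}}

\showchar{a}
\showchar{ }

\end{document}
```

The log should report the letter a for the first call and a blank space for the second. The file typesets nothing, so LaTeX will note that there are no pages of output. Since \let is an assignment and not an expansion, the same loop cannot run inside \typeout or \write, which is where the error above comes from.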