[Tex/LaTex] How to deal with macro expansion and ‘TeX capacity exceeded’ when using write18

errors expansion macros shell-escape

This issue is driving me nuts. Ever since the friendly and knowledgeable people here on TeX SE guided me to doing multiple string replacements the proper way, I've been trying to incorporate that knowledge into my code, to no avail so far. The difficulties are:

  1. The string replacement facility uses \noexpandarg / \expandafter, which do not cooperate well with \write18, and
  2. There is a nasty condition with \write18 and/or \input that causes (La)TeX to throw a TeX capacity exceeded, sorry message.

Let's look at the M(W)E below. It does not really work as it stands, because I'm using curl to address a local server which is not featured here. For those who are interested: this code is intended for CoffeeXeLaTeX, an attempt to make TeX scriptable with JavaScript (and, therefore, CoffeeScript). You cannot successfully run the code below unless you have a web server at the address specified; I realize that this dependency is wholly incidental to the effects I want to demonstrate, and I'm willing to rewrite the example so we can take the brackets out of M(W)E. That said, I'm sure that many of you will just exclaim "Aha! Beginner's mistake!" when reading this code:

\documentclass[a4paper]{article}
\usepackage{xstring}
\usepackage[usenames,dvipsnames,svgnames,table]{xcolor}


% -----------------------------------------------------------------
% URL escaping simplified;
% as per https://tex.stackexchange.com/questions/153215/how-to-do-multiple-string-replacements
\newcommand{\urlescapestep}[2]{%
  \expandafter\StrSubstitute\expandafter{\x}{#1}{#2}[\x]%
  }
\newcommand{\urlescape}[1]{{%
  \noexpandarg%
  \StrSubstitute{#1}{\%}{\%25}[\x]%
  \urlescapestep{/}{\%2F}%
  \urlescapestep{a}{A}% just for this test; imagine useful stuff here
  \x}}

\newcommand{\escapeONE}[1]{*#1*}

\newcommand{\escapeTWO}[1]{\StrSubstitute{#1}{a}{A}}

% -----------------------------------------------------------------
% Execute a command with `\write18`:
\newcommand{\CXtempoutroute}{/tmp/CXtempout.tex}

\newcommand{\exec}[1]{%
  \immediate\write18{#1 > "\CXtempoutroute"}\input{\CXtempoutroute}}

% -----------------------------------------------------------------
% `curl` commands using the string replacement commands:
\newcommand{\curlPlain}[4]{%
  \exec{curl --silent --show-error #1 #2/#3#4}}

\newcommand{\curlUrlescape}[4]{%
  \exec{curl --silent --show-error #1 #2/#3\urlescape{#4}}}

\newcommand{\curlONE}[4]{%
  \exec{curl --silent --show-error #1 #2/#3\escapeONE{#4}}}

\newcommand{\curlTWO}[4]{%
  \exec{curl --silent --show-error #1 #2/#3\escapeTWO{#4}}}

% -----------------------------------------------------------------
\begin{document}

First, let's show our three string escaping mechanisms all work under
normal circumstances:

\verb#\urlescape#: \urlescape{abc}

\verb#\escapeONE#: \escapeONE{abc}

\verb#\escapeTWO#: \escapeTWO{abc}

These do work and turn \verb#abc# into \verb#Abc#, \verb#*abc*#, and \verb#Abc#, respectively.

Now let's use the various \verb#curl*# methods:

works: \verb#\curlPlain{}{127.0.0.1:8910}{foobar.tex/helo/}{abc}#: 
  \curlPlain{}{127.0.0.1:8910}{foobar.tex/helo/}{abc}

works: \verb#\curlONE{}{127.0.0.1:8910}{foobar.tex/helo/}{abc}#: 
  \curlONE{}{127.0.0.1:8910}{foobar.tex/helo/}{abc}

throws: \verb#\curlTWO{}{127.0.0.1:8910}{foobar.tex/helo/}{abc}#: 
  \curlTWO{}{127.0.0.1:8910}{foobar.tex/helo/}{abc}

\end{document}

The code as it stands throws TeX capacity exceeded, sorry [text input levels=15], which would seem to indicate that TeX is recursively calling itself too many times.

During the past endless hours, I have also managed to get a TeX capacity exceeded, sorry [input stack size=5000] and, significantly, those dreaded Use of \StrSubstitute doesn't match its definition errors, depending on exactly where I happened to put the magic \edef, \noexpand, and \expandafter.

I fully admit that I was doing cargo cult programming, but then I also took out my vintage 1990 Taiwan edition of The TeXbook. I read Bechtolsheim, and I searched the internet so often that Google wants to send me a golden customer card now. I erased the experimental code and started over so I can post this question with some meaningful code; I could probably try and reconstruct some of the intermediate attempts.

What I found out so far is that when TeX writes stuff to a file, it will postpone that writing because it assumes that "most of the time that is what you'll want" (not sure I can agree with that; why not write \postpone\write18 if you really want to have it work that way?). You can make it act now with \immediate, but since there ain't no bait without a hook in TeX, this does not do exactly what you'd think it should; it only 'works similarly in most cases'.
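A minimal sketch of the difference between a deferred and an immediate \write, assuming an ordinary pdfLaTeX run. The stream name \demofile, the file name \jobname.demo, and the macro \status are placeholders invented for this illustration and are not part of the code above:

\documentclass{article}
\newwrite\demofile                      % hypothetical output stream
\immediate\openout\demofile=\jobname.demo
\begin{document}
\def\status{early}
% Deferred: stored in the page and only expanded when that page is shipped out.
\write\demofile{deferred: \status}
% Immediate: expanded and written right here.
\immediate\write\demofile{immediate: \status}
\def\status{late}
Some text, so that a page is actually shipped out.
\end{document}

Under these assumptions the file should contain the immediate line first, with early, and the deferred line second, with late, because the deferred \write expands its argument only at shipout, after \status has been redefined.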

As a programmer, I'm used to dealing with recursive function calling and delayed, asynchronous code execution in the event loop. Curiously, this knowledge is of little help to me when TeX complains:

  • Where do the recursive calls take place in the above code? I cannot spot them.

  • Why do I sometimes get those stack size exceeded errors for innocuous input files when using \input (I believe)?

  • Why does the presence of \noexpand / \expandafter break \write18?

  • What am I supposed to do against these symptoms (short of reading the implementation code down to the last expansion of every command inside the argument to write18)?

  • Is all this in some way related to the issue of robust / fragile commands and 'moveable arguments'?

I feel justified in asking this convoluted, huge question because, after fiddling with TeX for some months (and managing to produce output, so it's worth the try), my feeling is that these are the convoluted, shady quarters of TeX where visitors don't like to go, where no easy transport is available, and where only the toughest can make it work at all. Which about sums up my motivation for coming up with CoffeeXeLaTeX.

Best Answer

The \urlescape command in the previous question is only useful for printing a “purified” URL. You need a different version for using it in \write; note that some changes are necessary for producing literal % and not \%.

\documentclass[a5paper]{article}
\usepackage{xstring}

\newcommand{\urlescapestep}[2]{%
  \expandafter\StrSubstitute\expandafter{\x}{#1}{#2}[\x]%
}

% In the following group endlines do not produce spaces
% and `%'  becomes a printable character
\begingroup
\endlinechar=-1
\catcode`\%=12
\gdef\urlescape#1{{
  \noexpandarg
  \StrSubstitute{#1}{%}{%25}[\x]
  \urlescapestep{/}{%2F}
  \urlescapestep{\&}{%26}
  \urlescapestep{ }{%20}
  \urlescapestep{\$}{%24}
  \urlescapestep{+}{%2b}
  \urlescapestep{,}{%2c}
  \urlescapestep{:}{%3a}
  \urlescapestep{;}{%3b}
  \urlescapestep{?}{%3f}
  \urlescapestep{@}{%40}
  \urlescapestep{"}{%22}
  \urlescapestep{<}{%3c}
  \urlescapestep{>}{%3e}
  \urlescapestep{\#}{%23}
  \urlescapestep{\{}{%7b}
  \urlescapestep{\}}{%7d}
  \urlescapestep{|}{%7c}
  \urlescapestep{\^}{%5e}
  \urlescapestep{\~}{%7e}
  \urlescapestep{[}{%5b}
  \urlescapestep{]}{%5d}
  \urlescapestep{\`}{%60}
  \global\let\urlescapecurrent=\x}}
\endgroup

\newcommand{\curlUrlescape}[4]{%
  \urlescape{#4}%
  \exec{curl --silent --show-error #1 #2/#3\urlescapecurrent}%
}

% A temporary version just for testing
\newcommand{\exec}[1]{%
  \typeout{I would use^^J^^J%
  #1%
  ^^J^^Jin \string\immediate\string\write18}%
}

\begin{document}

% a foolish URL, just for testing
\curlUrlescape{}{127.0.0.1:8910}{foobar.tex/helo/}{abc|][}

\end{document}

Here's what's printed in the log file:

I would use

curl --silent --show-error  127.0.0.1:8910/foobar.tex/helo/abc%7c%5d%5b

in \immediate\write18
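
The pattern behind this fix is the order of operations: \StrSubstitute is not expandable, so it cannot do its work inside the expansion-only argument of \write. The substitution has to run first, store its result in a macro, and only that macro goes into the \write. A minimal sketch of the pattern, with \typeout standing in for \write18 as in the test above; the macro name \result is a placeholder chosen for this illustration:

\documentclass{article}
\usepackage{xstring}
\begin{document}
% Step 1: run the substitution as a normal command and store the outcome.
\StrSubstitute{abc}{a}{A}[\result]
% Step 2: \result now expands to plain characters, so it is safe in \typeout or \write.
\typeout{Result: \result}
\end{document}

This should show Result: Abc on the terminal and in the log file. Putting \StrSubstitute{abc}{a}{A} directly inside the \typeout argument instead is the situation that \curlTWO creates in the question.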