[TeX/LaTeX] General text/string manipulation from within LaTeX

text manipulation

Is there a way to easily code, from within (La)TeX, text manipulations such as "find" and "replace", so that I can automate tedious manual labor?

(EDIT: Christian Hupfer mentioned in the comments that l3regex might be a solution that satisfies the constraints listed below, although David Carlisle said "no". Which one is it?)

There are some important constraints though:

  • The approach has to be general, i.e. using macros alone won't work, because I may want to do things like replace all occurrences of "=" with ">".

  • I don't want to use LuaTeX (where, I have heard, this can be done easily).

  • Coding should ideally be done in the preamble. As indicated here, this can be done easily via the xstring package, but that has a disadvantage: if I want to search the whole text (which can be quite large), I have to enclose everything in a \StrSubstitute[0], which seems an ugly approach and requires me to modify the content of my document, which I'd rather leave untouched.

  • At the very least I should be able to do text replacement, but I'd hope for more advanced capabilities: at least a subset of the text-manipulation features of a Linux command-line tool such as sed.

Best Answer

You've stated that you "may want to do things like replace all occurrences of = with >" and also that "[c]oding should ideally be done in the preamble".

I'm going to keep my fingers crossed that you'll reconsider the decision not to use LuaLaTeX. Lua (the programming language) has a very flexible and powerful string library, and LuaTeX lets you assign Lua-coded functions to various "callbacks", which meets your requirement that all coding be done in the preamble. In the following example, the function eq2gt (which, as its name suggests, replaces all instances of = with >) is assigned to the process_input_buffer callback. This callback receives each input line at a very early stage of processing, viz., before TeX's "eyes" start reading it. That way, the eq2gt function acts as a pre-processor, modifying the input "on the fly", one line at a time, before TeX typesets it.

% !TEX TS-program = lualatex
\documentclass{article}

%% Lua-side code
\usepackage{luacode}
\begin{luacode}
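-- Replace every "=" in one input line with ">".
-- The outer parentheses discard gsub's second return value (the match count).
function eq2gt ( buff )
   return ( string.gsub ( buff , "=" , ">" ) )
end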
\end{luacode}

%% TeX-side code
\AtBeginDocument{\luadirect{luatexbase.add_to_callback(
   "process_input_buffer" , eq2gt , "eq2gt" )}}

\begin{document} 
\[
1+1+1=2    % not correct as typed; typeset as 1+1+1>2
\]

$1-1-1=-2$ % not correct as typed either; typeset as 1-1-1>-2
\end{document}
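
The same callback mechanism could, in principle, carry several sed-style substitutions at once. The following is only a sketch under stated assumptions: the table name rules, the function name apply_rules, and the two sample patterns are made up for illustration and are not part of the example above. Note also that Lua patterns are not full regular expressions (there is no alternation, for instance), so this covers only a subset of what sed offers.

```latex
% !TEX TS-program = lualatex
\documentclass{article}

%% Lua-side code (luacode* is fully verbatim, so % and ~ are safe here)
\usepackage{luacode}
\begin{luacode*}
-- Hypothetical rule table: each entry is { Lua pattern , replacement }.
local rules = {
   { "colour"     , "color"  } ,  -- plain word swap
   { "(%d+)%s*km" , "%1~km"  } ,  -- tie a number to its unit
}

-- Apply every rule, in order, to one line of input.
function apply_rules ( buff )
   for _ , rule in ipairs ( rules ) do
      -- keep only gsub's first return value (the modified string)
      buff = string.gsub ( buff , rule[1] , rule[2] )
   end
   return buff
end
\end{luacode*}

%% TeX-side code: activate the rules for the document body only
\AtBeginDocument{\directlua{luatexbase.add_to_callback(
   "process_input_buffer" , apply_rules , "apply_rules" )}}

\begin{document}
The colour chart covers 12 km.  % read by TeX as: The color chart covers 12~km.
\end{document}
```

Because the callback is registered via \AtBeginDocument, the preamble is left alone. The rules then act on every subsequent input line of the body, including comments and macro names, so the patterns should be chosen narrowly.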