[TeX/LaTeX] Automatic LaTeX parsing to replace text with macros

I know this title is pretty bad, but I couldn't come up with anything better.
Basically, I want to be able to write normal text and have it displayed in a custom alphabet in the PDF once compiled. This is different from just using a custom font because, for example, if I write "la" it should correspond to one character, and "lo" to a different character.
The point is that I want to be able to write it without typing ten billion macros myself; for example, writing \customLA \customLO instead of la lo would be extremely tedious if I want to write lines and lines.

Is there a simple and automatic way to do this with LaTeX? Or should I write my own parsing program in C++ that modifies the LaTeX file to replace character combinations with macros?

Best Answer

This is a simple replacement tool at the Lua level. Everything passed to \parseme is a target for replacement, but TeX commands inside it are not expanded, so a portion of text can be hidden from the replacement procedure by wrapping it in a macro. The replacements are stored in a plain Lua table: the first column holds the search terms and the second column holds their replacements. The rules are applied from top to bottom. For instance, A is replaced early and never again, whereas X becomes Y, then Y becomes Z, and Z becomes A. Command names are replaced too, e.g. \colorme would become \cojuicerme; I used the \clrme command to illustrate a quick workaround, if needed.
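
The top-to-bottom behaviour can be tried in plain Lua, outside TeX. The following is only a minimal sketch: the names `rules` and `replace_all` are made up for this illustration, and the table is a shortened version of the one in the example. Note that `string.gsub` treats the search term as a Lua pattern, so terms containing characters such as `.`, `%` or `-` would need escaping.

```lua
-- Hypothetical, shortened rule table: {search term, replacement}
local rules = {
  {"la", "beer"},
  {"lo", "juice"},
  {"A", "Hello"},
  {"X", "Y"},
  {"Y", "Z"},
  {"Z", "A"},
}

-- Apply every rule in table order; later rules see the output of earlier ones.
local function replace_all(text)
  for _, rule in ipairs(rules) do
    -- gsub returns the new string and a match count; only the string is kept
    text = text:gsub(rule[1], rule[2])
  end
  return text
end

print(replace_all("X"))             -- X -> Y -> Z -> A; the A rule has already run
print(replace_all("\\colorme{la}")) -- the "lo" inside the command name is hit too
```

The second call shows why \colorme is unsafe with this approach: the command name contains "lo", so it is rewritten along with the ordinary text.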

If you need a more advanced tool for string manipulation than the string library offers, I can recommend the LPeg library, which is already built into LuaTeX. Some examples are given in the article Programming in LuaTeX.
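
As a rough idea of what an LPeg variant might look like, here is a sketch that is not part of the example in this answer. It rewrites the text in a single pass, so a replacement is never replaced again, and it copies control sequences verbatim, so a name like \colorme stays intact. The names `syllable`, `control_sequence`, `rewrite` and `parse_once` are hypothetical, and treating a control sequence as a backslash followed by ASCII letters is a simplifying assumption.

```lua
-- In LuaTeX lpeg is available as a global; plain Lua needs the lpeg module.
local lpeg = lpeg or require("lpeg")
local P, R, Cs = lpeg.P, lpeg.R, lpeg.Cs

-- Each alternative maps a search term to its replacement string.
local syllable = P("la") / "beer" + P("lo") / "juice"
               + P("X") / "Y" + P("Y") / "Z"

-- Assumption: a control sequence is a backslash followed by ASCII letters.
local control_sequence = P("\\") * R("az", "AZ")^1

-- Substitution capture: control sequences and unmatched characters are
-- copied unchanged, matched terms are swapped for their replacements.
local rewrite = Cs((control_sequence + syllable + P(1))^0)

local function parse_once(text)
  return rewrite:match(text)
end

print(parse_once("lalo X"))         -- beerjuice Y (Y is not rewritten again)
print(parse_once("\\colorme{lo}"))  -- \colorme{juice}
```

Whether the single-pass behaviour is wanted depends on the use case: the chained X to Y to Z to A effect of the table-driven loop no longer occurs here.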

If you run my example, you will see 1 2 3 4 5 6 three times in the terminal: the six replacement rules are tested once for each of the three \parseme calls made during typesetting.

%! lualatex mal-text-parser.tex
\documentclass[a4paper]{article}
\pagestyle{empty}
\parindent=0pt
% It fixes beginnings and endings of I/O lines, among other things...
\usepackage{luatextra}
\usepackage{luacode}
\usepackage{xcolor}

\begin{document}
\def\parseme#1{% TeX definition...
  \directlua{parseme("\unexpanded{#1}")}%
% Note: "\unexpanded{#1}" works, but every backslash inside the \parseme argument must be written as \\.
  }% End of \parseme command...

\begin{luacode*}
-- A conversion table, from -> to
local maltable={
  {"la", "beer"},
  {"lo", "juice"},
  {"A", "Hello World! I was here! Phantom-as!"},
  {"X","Y"},
  {"Y","Z"},
  {"Z","A"},
  }
function parseme(text) -- Lua function called from \parseme
  -- Working variable, initialised with the argument sent by TeX...
  local content = text
  -- Apply every replacement rule, from top to bottom...
  for i=1,#maltable do
    tex.sprint("\\message{"..i.."}") -- report the rule number on the terminal
    content=string.gsub(content,maltable[i][1],maltable[i][2])
  end
  -- Print the result on the terminal and typeset it in the document...
  print(content)
  tex.print(content)
end
\end{luacode*}

\def\formated{\textbf{lalo}} % This part is not replaced.
\def\clrme#1{{\color{red}#1}} % Definition of the command.
% We use \clrme instead of \colorme because we would get -> \cojuicerme as lo is being replaced...
% lalo in \formated is protected from expansion and replacement...
Text of the paragraph. \parseme{My long \\formated{} sentence. \\clrme{To la red!} \\textbf{Hey!} Ending. lalala lo lo lo A}\par 
End of the paragraph. \parseme{My \\clrme{next} try.} The end.\par
Input is X, result is: \parseme{X}; X goes to Y to Z to A, but A is not replaced anymore.

\end{document}

mwe of the replacement tool
