[TeX/LaTeX] How could the macro file xii.tex be simplified into a more readable form?

macros, tex-core

I was reading some TeX guides, where I found this source:

\let~\catcode~`76~`A13~`F1~`j00~`P2jdefA71F~`7113jdefPALLF
PA''FwPA;;FPAZZFLaLPA//71F71iPAHHFLPAzzFenPASSFthP;A$$FevP
A@@FfPARR717273F737271P;ADDFRgniPAWW71FPATTFvePA**FstRsamP
AGGFRruoPAqq71.72.F717271PAYY7172F727171PA??Fi*LmPA&&71jfi
Fjfi71PAVVFjbigskipRPWGAUU71727374 75,76Fjpar71727375Djifx
:76jelse&U76jfiPLAKK7172F71l7271PAXX71FVLnOSeL71SLRyadR@oL
RrhC?yLRurtKFeLPFovPgaTLtReRomL;PABB71 72,73:Fjif.73.jelse
B73:jfiXF71PU71 72,73:PWs;AMM71F71diPAJJFRdriPAQQFRsreLPAI
I71Fo71dPA!!FRgiePBt'el@ lTLqdrYmu.Q.,Ke;vz vzLqpip.Q.,tz;
;Lql.IrsZ.eap,qn.i. i.eLlMaesLdRcna,;!;h htLqm.MRasZ.ilk,%
s$;z zLqs'.ansZ.Ymi,/sx ;LYegseZRyal,@i;@ TLRlogdLrDsW,@;G
LcYlaDLbJsW,SWXJW ree @rzchLhzsW,;WERcesInW qt.'oL.Rtrul;e
doTsW,Wk;Rri@stW aHAHHFndZPpqar.tridgeLinZpe.LtYer.W,:jbye

The text preceding the code is something like:

TeX is a macro language and the meaning of existing commands can be changed on
the fly, and also new commands can be defined on the fly [Knu84]. Perhaps the most
extreme example of this is David Carlisle's xii.tex TeX code, which is obtainable as http://mirrors.ctan.org/macros/plain/contrib/xii/xii.tex:

Now my question is: how could this so-called extreme example of a macro be simplified so that it is readable for beginner TeX/LaTeX users?

I have run the code through TeX and I was really surprised to see the result!

I also have one more basic question:

Could all TeX commands be used from any LaTeX file?

EDIT

With the answer from hendrik-vogt and the related answer from joseph-wright, I have been trying to understand the obfuscated code.

Running the code through TeX after adding \tracingall at the beginning gives some possibly useful information, but output like the following is still quite a stumbling block for me:

A71->~`7113\def 
71<-L
{\catcode}
{\def}

That looks to me something like BF. I do not understand it very well, but I am getting some hints about what is going on behind the scenes. Can someone help me understand what the previous block of output means?

Best Answer

The short answer

There are actually three levels of obfuscation in that code: Firstly, some clever recursive macros are used to compress the text to just a few lines. Secondly, some kind of substitution and transposition ciphers are implemented by rather straightforward macros like \defR#1#2#3{#3#2#1} (where R is an active character). The third level is the use of category codes to avoid intelligible TeX-specific characters like {, }, # and \.

Some more explanations

I saw this TeX file more than 10 years ago and thought "Eh?" So I tried to take it apart step by step. On the way I learned a lot about TeX's macro language, and also about category codes. To begin with, I'll only post the very first steps I took. (I still have the files on my computer, and this one is named hae.1, "hae" being German and meaning "eh?")

\let~\catcode  ~`76  ~`A13  ~`F1  ~`j00  ~`P2
\defA#1{~`#113\def}
ALL{
}
A''{w}      A;;{}    AZZ{LaL}    A//#1{#1i}   AHH{L}
Azz{en}     ASS{th}; A$${ev}     A@@{f}
ARR#1#2#3{#3#2#1};
ADD{Rgni}   AWW#1{}  ATT{ve}     A**{stRsam}  AGG{Rruo}
Aqq#1.#2.{#1#2#1}                AYY#1#2{#2#1#1}
A??{i*Lm}   A&&#1\fi{\fi#1}      AVV{\bigskipR}WG
AUU#1#2#3#4 #5,#6{\par#1#2#3#5D\ifx:#6\else&U#6\fi}L
AKK#1#2{#1l#2#1}
AXX#1{VLnOSeL#1SLRyadR@oL RrhC?yLRurtK{eL}{ov}gaTLtReRomL;}
ABB#1 #2,#3:{\if.#3.\else B#3:\fiX{#1}U#1 #2,#3:}Ws;
AMM#1{#1di}  AJJ{Rdri}  AQQ{RsreL}  AII#1{o#1d}  A!!{Rgie}
Bt'el@ lTLqdrYmu.Q.,Ke;vz vzLqpip.Q.,tz;
;Lql.IrsZ.eap,qn.i. i.eLlMaesLdRcna,;!;h htLqm.MRasZ.ilk,%
s$;z zLqs'.ansZ.Ymi,/sx ;LYegseZRyal,@i;@ TLRlogdLrDsW,@;G
LcYlaDLbJsW,SWXJW ree @rzchLhzsW,;WERcesInW qt.'oL.Rtrul;e
doTsW,Wk;Rri@stW aHAHH{ndZ}pqar.tridgeLinZpe.LtYer.W,:\bye

You see, I replaced F with { and P with } throughout. Why? ~`F1 gives the character F category code 1 (since the active character ~ is \let to \catcode), and TeX understands such characters as opening braces. In the same way, P is given category code 2 by ~`P2, so it acts as a closing brace.
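The same trick can be tried in isolation. The following is only a minimal plain TeX sketch, assuming a fresh file in which nothing else needs the letters F and P; the macro name \greeting is made up for illustration and does not occur in xii.tex:

```tex
% Plain TeX sketch (hypothetical example, not part of xii.tex).
\catcode`\F=1  % F now opens a group, like {
\catcode`\P=2  % P now closes a group, like }
\def\greetingFhelloP  % read by TeX as \def\greeting{hello}
\greeting
\bye
```

Since F is no longer a letter, the control word \greeting ends just before it, which is also why the original code can write its definitions without any separating spaces.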

The next thing is to understand ~`76 and ~`j00. The first makes 7 behave like the macro parameter character # (category code 6), and the second makes j behave like the control sequence character \ (category code 0), so I've replaced 7 with # and j with \ throughout. This enhances readability quite a bit already. Moreover, I added some white space and line breaks, which helps some more.
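To see these two assignments at work on their own, here is a small plain TeX sketch. The macro name \twice is an assumption made for illustration only; after the two \catcode lines, the digit 7 and the letter j can no longer be used as ordinary text in the file:

```tex
% Plain TeX sketch (hypothetical example, not part of xii.tex).
\catcode`\7=6  % 7 now marks macro parameters, like #
\catcode`\j=0  % j now starts control sequences, like \
jdef\twice71{7171}  % read by TeX as \def\twice#1{#1#1}
\twice{ab}
jbye
```

The backslash keeps working as an escape character, so \twice and jbye can be mixed freely; the obfuscated file simply avoids the backslash wherever it can.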

The key point now is to understand what A does. This is an "active character" (category code 13) due to ~`A13, so it behaves like a control sequence. So what does the definition \defA#1{~`#113\def} mean? A takes one argument #1. Then ~ acts as \catcode, so A gives its argument category code 13, and then it issues an additional \def.
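Written without the ~ shortcut, the behaviour of A can be sketched as follows. This is an illustrative plain TeX fragment, not the original code: it spells out \catcode, adds the optional = sign and a space after 13 for readability, and the sample sentence is invented:

```tex
% Plain TeX sketch (hypothetical example, not part of xii.tex).
\catcode`\A=13                % make A an active character
\defA#1{\catcode`#1=13 \def}  % same job as \defA#1{~`#113\def}
A''{w}                        % acts like \catcode`'=13 \def'{w}
'e 'ish you a merry Christmas
\bye
```

The first ' is consumed as the argument of A while it is still an ordinary character; the second ' is only read after the category code has changed, so \def sees it as an active character and defines it.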

So how does this fit with the output of \tracingall you got? The first line

A71->~`7113\def 

says the following: To the left of -> you see that the active character A takes one argument #1 (recall that 7 acts as #); to the right of -> you see the corresponding expansion. Now the second line

71<-L

says that #1 should be substituted with L, so that the expansion is ~`L13\def. Now ~ was \let to \catcode, so the category code assignment \catcode`L13 is performed next (making L an active character); then the \def is executed. This is what you see in the next two lines:

{\catcode}
{\def}

(Unfortunately \tracingall doesn't say anything specific about the execution of \catcode and \def.)

Let's look at the usage A''{w}. This expands to \catcode`'13\def'{w}, so ' is made an active character and is given a definition, namely that ' should expand to w (one of the substitution ciphers)! Just one more example: A??{i*Lm} makes ? active and gives it the definition i*Lm, which in turn expands to istRsam m, since * expands to stRsam and L to a space (the braces in its definition contain only a line end, which TeX reads as a space). The final result is istmas m, since R acts as a transposition cipher that reverses the three characters following it, as mentioned in the very beginning. And now we're able to understand the little piece RrhC?y in the code: it expands to Christmas my!
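The expansion of RrhC?y can be replayed with ordinary control sequences. In this sketch the names \Rev, \Spc, \Star and \Quest are made up to stand in for the active characters R, L, * and ?; everything else follows the definitions quoted above:

```tex
% Plain TeX sketch with made-up names in place of the active characters.
\def\Rev#1#2#3{#3#2#1}     % R: reverse the next three tokens
\def\Spc{ }                % L: a single space
\def\Star{st\Rev sam}      % *: "st" followed by "sam" reversed
\def\Quest{i\Star\Spc m}   % ?: i, then *, then a space, then m
\Rev rhC\Quest y
\bye
```

Here \Rev rhC yields Chr, \Quest yields istmas m, and the trailing y completes the phrase, so the last line before \bye should typeset Christmas my.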

A challenge

If you remove all the cipher and catcode business, you just have some recursive macros that represent the text rather efficiently. Can anyone do it with fewer than 479 characters?

\let~\def~\U#1,#2:{\par#1ing \if.#2.\else\U#2:\fi}~\,{\def\,{and
}}~\;#1~#2 #3,#4:{\if.#4.\else\;#4:\fi\bigskip On the #1#2th day of
Christmas my true love gave to me\U#1#3,#4:}~~#1 {}\;twel~f ve drummers
drumm,eleven~ \ pipers pip,ten~ \ lords a leap,nin~ e ladies danc,eigh~
t maids a milk,seven~ \ swans a swimm,six~ \ geese a lay,fi~f ve gold
rings~,four~ \ calling birds~,th~ird\ ~ ree french hens~,~second\ ~ two
turtle doves~,~first\ ~ \,a partridge in a pear tree.~,:\bye
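The recursion behind this compact version can be seen more easily in a stripped-down form. The following plain TeX sketch is not the code above: the macro names \gifts and \verses, the shortened gift list and the verse heading are assumptions made for illustration. It keeps only the two ideas that matter, namely a list terminated by ",:" and the test \if.#2. that detects an empty remainder:

```tex
% Plain TeX sketch (hypothetical macros, shortened gift list).
% \gifts prints the first item, then calls itself on the rest of the list.
\def\gifts#1,#2:{\par#1\if.#2.\else\gifts#2:\fi}
% \verses first handles the shorter list (an earlier day), then prints its own verse.
\def\verses#1,#2:{\if.#2.\else\verses#2:\fi
  \bigskip Starting with #1:\gifts#1,#2:}
\verses three french hens,two turtle doves,a partridge.,:
\bye
```

Because \verses recurses before it prints anything, the verse for the last list item comes out first, and each later verse repeats all the gifts that follow its own in the list. This is how a single list of twelve gifts can generate the whole song.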