[TeX/LaTeX] How to make an e-TeX WebAssembly with Jim Fowler’s WEB/TeX Pascal-to-WASM compiler web2js

e-tex, initex, tex-core, texlive-2019, web

I have a TeX Live 2019 distribution installed on Windows 10 and want to run a preloaded LaTeX based on e-TeX (with these packages among others: Calculator, Calculus, TikZ, CircuiTikZ) under WebAssembly in a web browser.

For this I found TikZJax, which works as follows (quoted from the readme.md of Jim Fowler's kisonecat/tikzjax):

How does this work?

Using https://github.com/kisonecat/web2js the Pascal source of tex is
compiled to WebAssembly; the latex format is loaded (without all the
hyphenation data), and

\documentclass[margin=0pt]{standalone}
\def\pgfsysdriver{pgfsys-ximera.def}
\usepackage{tikz}

is executed. Then core is dumped; the resulting core is compressed,
and by reloading the dumped core in the browser, it is possible to
very quickly get to a point where TikZ can be executed. By using an
SVG driver for PGF along with https://github.com/kisonecat/dvi2html
the DVI output is converted to an SVG.

All of this happens in the browser.

I followed these steps from the web2js instructions:

  1. download a clean copy of the TeX WEB sources; output: tex.web
  2. produce the Pascal source by tangling, but with this changed invocation: tangle -underline tex.web etex.ch (thanks to ShreevatsaR's tip); output: tex.p and tex.pool, after renaming from etex to tex
  3. compile tex.p to get the WebAssembly binary; output: out.wasm
  4. produce plain.fmt and a corresponding memory dump with a JavaScript file named initex.js; input: out.wasm, plain.tex; output: core.dump, plain.fmt, plain.log, texput.log
  5. compile sample.tex; input: core.dump; output: sample.dvi, sample.log

I can't figure out how to get an etex.ch that contains all the changes needed for an e-TeX build in Pascal (running in WebAssembly).

I am not able to compile tex.p (which is actually an etex.p) with web2js to get the WebAssembly binary out.wasm.

I learned that several changes are missing from etex.ch, e.g. for memory management.

Here is the error from a compile attempt:

c:\texlive\eTeX\web2js\node_modules\binaryen\index.js:7
if(t){v=__dirname+"/";var ba,ca;a.read=function(c,e){var g=w(c);
g||(ba||(ba=require("fs")),ca||(ca=require("path")),
c=ca.normalize(c),g=ba.readFileSync(c)); return e?g:g.toString()};
a.readBinary=function(c){c=a.read(c,!0);c.buffer||
(c=new Uint8Array(c));assert(c.buffer);return c};1<process.argv.length&&
(a.thisProgram=process.argv[1].replace(/\\/g,"/"));
a.arguments=process.argv.slice(2);
process.on("uncaughtException",function(c){if(!(c instanceof x))
throw c;});process.on("unhandledRejection",y);a.quit=
    ^
Need 32906 of memory

To get this error, I changed the following constants in tex.web before tangling:

  • max_strings 3000 -> 500000
  • string_vacancies 8000 -> 90000
  • pool_size 32000 -> 6250000
  • max_halfword 65535 -> 268435455
  • mem_max 30000 -> 268435455
  • buf_size 500 -> 200000
  • stack_size 200 -> 5000
  • mem_top 30000 -> 268435455

Without these changes I get this error:

! You have to increase POOLSIZE.

How do I make the etex.ch right?

Update 04.08.2019


Thanks to Marcel Krüger's etex.sys I can now create a plain e-TeX without any problems.

Annotations on this very valuable answer:
– WebAssembly memory page size: 64 KiB i.e. 65536 Bytes [1]
– WebAssembly memory implementation limit: 2GB (as of today) => 32767 pages [2]


1 Allow providing more initial memory than specified by the module #540
2 Can not set TOTAL_MEMORY greater than 2Gb or expand the memory to greater than 2Gb

Best Answer

Your increase of the pool size led to additional memory requirements. So you do not need any other changes to e-TeX; you just have to increase the memory provided. In the JavaScript files, the amount of memory is set in the "compiler". With your settings you would need 32906 pages of memory, but there is an implementation limit of 32767 pages. Luckily, you can avoid this problem by using smaller values.
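The page arithmetic behind this can be sketched as follows. The 64 KiB page size and the 32767-page limit are the figures quoted in the question; the 8 bytes per memory word is only an assumption made for illustration and is not taken from the web2js sources.

```javascript
// WebAssembly linear memory is allocated in pages of 64 KiB.
const PAGE_SIZE = 64 * 1024;
// Implementation limit quoted in the question (2 GB => 32767 pages).
const MAX_PAGES = 32767;

// Number of whole pages needed to hold `bytes` bytes.
function pagesFor(bytes) {
  return Math.ceil(bytes / PAGE_SIZE);
}

// ASSUMPTION: one word of TeX's |mem| array occupies 8 bytes in the
// compiled module. This is a guess for illustration only.
const BYTES_PER_MEM_WORD = 8;

// Pages needed for the |mem| array alone, indexed mem_min..mem_max.
function memArrayPages(memMin, memMax) {
  return pagesFor((memMax - memMin + 1) * BYTES_PER_MEM_WORD);
}

// mem_max from the question: under the assumption above, the |mem| array
// by itself already exceeds the page limit.
console.log(memArrayPages(0, 268435455), memArrayPages(0, 268435455) > MAX_PAGES);

// mem_max from the change file in this answer.
console.log(memArrayPages(0, 200000));

// The totals reported by the compiler in the two experiments.
console.log(32906 > MAX_PAGES, 41 > MAX_PAGES);
```

Under that assumption, the large mem_max accounts for most of the 32906 pages, which is why lowering mem_max matters more than lowering the pool size.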

So we need to change some of the constants from etex.web. This doesn't mean that your etex.ch is "wrong" and you need a "right" one. Actually, the license of etex.ch would forbid such modifications (at least without changing the name). Instead, you should write a system-dependent etex.sys file, which you can pass to tangle later.

So first get copies of tex.web and etex.ch, then run

tie -m etex.web tex.web etex.ch

to get etex.web. Now you need a changefile with your new constants; for example, save the following as etex.sys:

eTeX compatible constants for web2js

@x
@<Constants...@>=
@!mem_max=30000; {greatest index in \TeX's internal |mem| array;
  must be strictly less than |max_halfword|;
  must be equal to |mem_top| in \.{INITEX}, otherwise |>=mem_top|}
@!mem_min=0; {smallest index in \TeX's internal |mem| array;
  must be |min_halfword| or more;
  must be equal to |mem_bot| in \.{INITEX}, otherwise |<=mem_bot|}
@!buf_size=500; {maximum number of characters simultaneously present in
  current lines of open files and in control sequences between
  \.{\\csname} and \.{\\endcsname}; must not exceed |max_halfword|}
@!error_line=72; {width of context lines on terminal error messages}
@!half_error_line=42; {width of first lines of contexts in terminal
  error messages; should be between 30 and |error_line-15|}
@!max_print_line=79; {width of longest text lines output; should be at least 60}
@!stack_size=200; {maximum number of simultaneous input sources}
@!max_in_open=6; {maximum number of input files and error insertions that
  can be going on simultaneously}
@!font_max=75; {maximum internal font number; must not exceed |max_quarterword|
  and must be at most |font_base+256|}
@!font_mem_size=20000; {number of words of |font_info| for all fonts}
@!param_size=60; {maximum number of simultaneous macro parameters}
@!nest_size=40; {maximum number of semantic levels simultaneously active}
@!max_strings=3000; {maximum number of strings; must not exceed |max_halfword|}
@!string_vacancies=8000; {the minimum number of characters that should be
  available for the user's control sequences and font names,
  after \TeX's own error messages are stored}
@!pool_size=32000; {maximum number of characters in strings, including all
  error messages and help texts, and the names of all fonts and
  control sequences; must exceed |string_vacancies| by the total
  length of \TeX's own strings, which is currently about 23000}
@!save_size=600; {space for saving values outside of current group; must be
  at most |max_halfword|}
@!trie_size=8000; {space for hyphenation patterns; should be larger for
  \.{INITEX} than it is in production versions of \TeX}
@!trie_op_size=500; {space for ``opcodes'' in the hyphenation patterns}
@!dvi_buf_size=800; {size of the output buffer; must be a multiple of 8}
@!file_name_size=40; {file names shouldn't be longer than this}
@!pool_name='TeXformats:TEX.POOL                     ';
  {string of length |file_name_size|; tells where the string pool appears}
@.TeXformats@>

@ Like the preceding parameters, the following quantities can be changed
at compile time to extend or reduce \TeX's capacity. But if they are changed,
it is necessary to rerun the initialization program \.{INITEX}
@.INITEX@>
to generate new tables for the production \TeX\ program.
One can't simply make helter-skelter changes to the following constants,
since certain rather complex initialization
numbers are computed from them. They are defined here using
\.{WEB} macros, instead of being put into \PASCAL's |const| list, in order to
emphasize this distinction.

@d mem_bot=0 {smallest index in the |mem| array dumped by \.{INITEX};
  must not be less than |mem_min|}
@d mem_top==30000 {largest index in the |mem| array dumped by \.{INITEX};
  must be substantially larger than |mem_bot|
  and not greater than |mem_max|}
@y
@<Constants...@>=
@!mem_max=200000; {greatest index in \TeX's internal |mem| array;
  must be strictly less than |max_halfword|;
  must be equal to |mem_top| in \.{INITEX}, otherwise |>=mem_top|}
@!mem_min=0; {smallest index in \TeX's internal |mem| array;
  must be |min_halfword| or more;
  must be equal to |mem_bot| in \.{INITEX}, otherwise |<=mem_bot|}
@!buf_size=5000; {maximum number of characters simultaneously present in
  current lines of open files and in control sequences between
  \.{\\csname} and \.{\\endcsname}; must not exceed |max_halfword|}
@!error_line=72; {width of context lines on terminal error messages}
@!half_error_line=42; {width of first lines of contexts in terminal
  error messages; should be between 30 and |error_line-15|}
@!max_print_line=79; {width of longest text lines output; should be at least 60}
@!stack_size=1000; {maximum number of simultaneous input sources}
@!max_in_open=6; {maximum number of input files and error insertions that
  can be going on simultaneously}
@!font_max=75; {maximum internal font number; must not exceed |max_quarterword|
  and must be at most |font_base+256|}
@!font_mem_size=20000; {number of words of |font_info| for all fonts}
@!param_size=60; {maximum number of simultaneous macro parameters}
@!nest_size=40; {maximum number of semantic levels simultaneously active}
@!max_strings=60000; {maximum number of strings; must not exceed |max_halfword|}
@!string_vacancies=300000; {the minimum number of characters that should be
  available for the user's control sequences and font names,
  after \TeX's own error messages are stored}
@!pool_size=350000; {maximum number of characters in strings, including all
  error messages and help texts, and the names of all fonts and
  control sequences; must exceed |string_vacancies| by the total
  length of \TeX's own strings, which is currently about 23000}
@!save_size=600; {space for saving values outside of current group; must be
  at most |max_halfword|}
@!trie_size=8000; {space for hyphenation patterns; should be larger for
  \.{INITEX} than it is in production versions of \TeX}
@!trie_op_size=500; {space for ``opcodes'' in the hyphenation patterns}
@!dvi_buf_size=800; {size of the output buffer; must be a multiple of 8}
@!file_name_size=40; {file names shouldn't be longer than this}
@!pool_name='TeXformats:TEX.POOL                     ';
  {string of length |file_name_size|; tells where the string pool appears}
@.TeXformats@>

@ Like the preceding parameters, the following quantities can be changed
at compile time to extend or reduce \TeX's capacity. But if they are changed,
it is necessary to rerun the initialization program \.{INITEX}
@.INITEX@>
to generate new tables for the production \TeX\ program.
One can't simply make helter-skelter changes to the following constants,
since certain rather complex initialization
numbers are computed from them. They are defined here using
\.{WEB} macros, instead of being put into \PASCAL's |const| list, in order to
emphasize this distinction.

@d mem_bot=0 {smallest index in the |mem| array dumped by \.{INITEX};
  must not be less than |mem_min|}
@d mem_top==200000 {largest index in the |mem| array dumped by \.{INITEX};
  must be substantially larger than |mem_bot|
  and not greater than |mem_max|}
@z

@x
@d min_quarterword=0 {smallest allowable value in a |quarterword|}
@d max_quarterword=255 {largest allowable value in a |quarterword|}
@d min_halfword==0 {smallest allowable value in a |halfword|}
@d max_halfword==65535 {largest allowable value in a |halfword|}
@y
@d min_quarterword=0 {smallest allowable value in a |quarterword|}
@d max_quarterword=255 {largest allowable value in a |quarterword|}
@d min_halfword==0 {smallest allowable value in a |halfword|}
@d max_halfword==16777215 {largest allowable value in a |halfword|}
@z
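The comments in the change file state several constraints between these constants. A minimal sketch of a checker for them is shown below, assuming the constants are collected by hand into a plain object; the function name checkConstants and the object layout are hypothetical and not part of web2js or tangle. The figure of about 23000 characters for TeX's own strings comes from the comment above and may be too small for e-TeX.

```javascript
// Approximate total length of TeX's own strings, as stated in the
// pool_size comment. ASSUMPTION: e-TeX may need somewhat more.
const OWN_STRINGS_LENGTH = 23000;

// Returns a list of violated constraints (empty if all hold).
function checkConstants(c) {
  const problems = [];
  const expect = (ok, message) => { if (!ok) problems.push(message); };

  expect(c.mem_max < c.max_halfword, "mem_max must be strictly less than max_halfword");
  expect(c.mem_top <= c.mem_max, "mem_top must not be greater than mem_max");
  expect(c.mem_bot >= c.mem_min, "mem_bot must not be less than mem_min");
  expect(c.buf_size <= c.max_halfword, "buf_size must not exceed max_halfword");
  expect(c.max_strings <= c.max_halfword, "max_strings must not exceed max_halfword");
  expect(c.save_size <= c.max_halfword, "save_size must be at most max_halfword");
  expect(c.font_max <= c.max_quarterword, "font_max must not exceed max_quarterword");
  expect(c.half_error_line >= 30 && c.half_error_line <= c.error_line - 15,
    "half_error_line should be between 30 and error_line-15");
  expect(c.max_print_line >= 60, "max_print_line should be at least 60");
  expect(c.dvi_buf_size % 8 === 0, "dvi_buf_size must be a multiple of 8");
  expect(c.pool_size >= c.string_vacancies + OWN_STRINGS_LENGTH,
    "pool_size must exceed string_vacancies by the length of TeX's own strings");
  return problems;
}

// Values from the change file above.
const answerValues = {
  mem_max: 200000, mem_min: 0, mem_bot: 0, mem_top: 200000,
  buf_size: 5000, error_line: 72, half_error_line: 42, max_print_line: 79,
  font_max: 75, max_strings: 60000, string_vacancies: 300000,
  pool_size: 350000, save_size: 600, dvi_buf_size: 800,
  max_quarterword: 255, max_halfword: 16777215,
};

// Values listed in the question; constants it does not list keep the
// tex.web defaults, which coincide with the values above.
const questionValues = {
  ...answerValues,
  max_strings: 500000, string_vacancies: 90000, pool_size: 6250000,
  max_halfword: 268435455, mem_max: 268435455, mem_top: 268435455,
  buf_size: 200000,
};

console.log(checkConstants(answerValues));
console.log(checkConstants(questionValues));
```

With the values from the question, the check on mem_max is reported, because mem_max was set equal to max_halfword rather than strictly below it.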

Now you can run tangle:

tangle -underline etex.web etex.sys

You get the files etex.p and etex.pool.

Of course web2js will still look for tex.pool, but you can just change

filename = "tex.pool";

into

filename = "etex.pool";

in both header.js and library.js.
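If you prefer to script this edit, a minimal sketch of the text replacement is shown below. It only operates on a string; the helper name patchPoolName and the sample line are hypothetical, and only the two quoted assignments come from the files mentioned above.

```javascript
// Rewrites every occurrence of the tex.pool assignment quoted above.
// Running it twice is harmless: the patched text no longer contains
// the exact string being searched for.
function patchPoolName(source, poolName = "etex.pool") {
  const before = 'filename = "tex.pool";';
  const after = `filename = "${poolName}";`;
  return source.split(before).join(after);
}

// Hypothetical surrounding line, for illustration only.
const sample = 'filename = "tex.pool"; // pool file to load';
console.log(patchPoolName(sample));
```

To apply it, read each of header.js and library.js with fs.readFileSync(path, "utf8"), pass the contents through patchPoolName, and write the result back with fs.writeFileSync.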

Now let's try

node compile.js etex.p

Similar to your original experiment, we get

[...]

Need 41 of memory

Now 41 is significantly less than 32906; in particular, it is below 32767. So we can just allocate more memory. This needs to be done consistently in four files: in index.js, initex.js, tex.js and pascal/program.js, change

var pages = 20;

into

var pages = 50;

(Probably 41 would be enough, but 50 looks nicer)
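For orientation, this is roughly what a page count means on the JavaScript side. The sketch uses the standard WebAssembly.Memory constructor; how web2js actually passes its pages variable to the module is not shown in this answer, so treat the wiring here as an assumption.

```javascript
const PAGE_SIZE = 64 * 1024; // bytes per WebAssembly page
const pages = 50;            // the value chosen above

// ASSUMPTION: a fixed-size memory (initial equal to maximum) is used
// here only to keep the sketch simple.
const memory = new WebAssembly.Memory({ initial: pages, maximum: pages });

// The backing buffer is exactly pages * 64 KiB long.
console.log(memory.buffer.byteLength === pages * PAGE_SIZE);

// The compiler reported that 41 pages are needed, so 50 leaves some headroom.
console.log(pages >= 41);
```

A module that declares a larger minimum memory than the one it is given will fail to instantiate, which is why the value has to be kept consistent across all four files.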

Now we can try

node compile.js etex.p

again. This time it actually works! You could now run node initex.js to get a plain TeX format, but we actually want e-TeX. So get copies of etex.src, etexdefs.lib and language.def, and change

library.setInput("\nplain \\dump\n\n"

in initex.js into

library.setInput("\n*etex \\dump\n\n"

Here the asterisk * is important: it enables "extended mode". Also change &plain into &etex in the same file to preload etex.

Then

node initex.js

generates an e-TeX format etex.fmt and a memory dump, which can be used with

node tex.js