How to make an e-TeX WebAssembly with Jim Fowler's WEB/TeX Pascal to WASM compiler web2js?
Your increase of the pool size leads to additional memory requirements. So you do not need any other changes to eTeX; you just have to increase the memory provided. In your JavaScript version, the amount of memory is set in the "compiler".
For your settings you would need 32906 pages of memory, but there is an implementation limit at 32767 pages. Luckily you can avoid this problem by using smaller values.
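To make the page numbers concrete: WebAssembly linear memory is allocated in pages of 64 KiB, so a page count is just a byte requirement rounded up to that unit. The following is only a sketch; the function names are hypothetical, and the limit of 32767 is simply the figure quoted above, not something read out of web2js.

```javascript
// WebAssembly linear memory is measured in pages of 64 KiB.
const WASM_PAGE_BYTES = 64 * 1024;

// Limit quoted in the text above (an assumption here, not queried from web2js).
const PAGE_LIMIT = 32767;

// Number of whole pages needed to hold `bytes` bytes.
function pagesNeeded(bytes) {
  return Math.ceil(bytes / WASM_PAGE_BYTES);
}

// True if a page count stays within the quoted limit.
function fitsLimit(pages) {
  return pages <= PAGE_LIMIT;
}

console.log(pagesNeeded(65537)); // 2
console.log(fitsLimit(32906));   // false
```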
So we need to change some of the constants from etex.web.
This doesn't mean that your etex.ch is "wrong" and you need a "right" one. Actually, the license of etex.ch would forbid such modifications (at least without changing the name). Instead you should write a system-dependent etex.sys file which you can pass to tangle later.
So first get copies of tex.web and etex.ch, then run
tie -m etex.web tex.web etex.ch
to get etex.web. Now you need a change file with your new constants; for example, save the following as etex.sys:
eTeX compatible constants for web2js
@x
@<Constants...@>=
@!mem_max=30000; {greatest index in \TeX's internal |mem| array;
must be strictly less than |max_halfword|;
must be equal to |mem_top| in \.{INITEX}, otherwise |>=mem_top|}
@!mem_min=0; {smallest index in \TeX's internal |mem| array;
must be |min_halfword| or more;
must be equal to |mem_bot| in \.{INITEX}, otherwise |<=mem_bot|}
@!buf_size=500; {maximum number of characters simultaneously present in
current lines of open files and in control sequences between
\.{\\csname} and \.{\\endcsname}; must not exceed |max_halfword|}
@!error_line=72; {width of context lines on terminal error messages}
@!half_error_line=42; {width of first lines of contexts in terminal
error messages; should be between 30 and |error_line-15|}
@!max_print_line=79; {width of longest text lines output; should be at least 60}
@!stack_size=200; {maximum number of simultaneous input sources}
@!max_in_open=6; {maximum number of input files and error insertions that
can be going on simultaneously}
@!font_max=75; {maximum internal font number; must not exceed |max_quarterword|
and must be at most |font_base+256|}
@!font_mem_size=20000; {number of words of |font_info| for all fonts}
@!param_size=60; {maximum number of simultaneous macro parameters}
@!nest_size=40; {maximum number of semantic levels simultaneously active}
@!max_strings=3000; {maximum number of strings; must not exceed |max_halfword|}
@!string_vacancies=8000; {the minimum number of characters that should be
available for the user's control sequences and font names,
after \TeX's own error messages are stored}
@!pool_size=32000; {maximum number of characters in strings, including all
error messages and help texts, and the names of all fonts and
control sequences; must exceed |string_vacancies| by the total
length of \TeX's own strings, which is currently about 23000}
@!save_size=600; {space for saving values outside of current group; must be
at most |max_halfword|}
@!trie_size=8000; {space for hyphenation patterns; should be larger for
\.{INITEX} than it is in production versions of \TeX}
@!trie_op_size=500; {space for ``opcodes'' in the hyphenation patterns}
@!dvi_buf_size=800; {size of the output buffer; must be a multiple of 8}
@!file_name_size=40; {file names shouldn't be longer than this}
@!pool_name='TeXformats:TEX.POOL ';
{string of length |file_name_size|; tells where the string pool appears}
@.TeXformats@>
@ Like the preceding parameters, the following quantities can be changed
at compile time to extend or reduce \TeX's capacity. But if they are changed,
it is necessary to rerun the initialization program \.{INITEX}
@.INITEX@>
to generate new tables for the production \TeX\ program.
One can't simply make helter-skelter changes to the following constants,
since certain rather complex initialization
numbers are computed from them. They are defined here using
\.{WEB} macros, instead of being put into \PASCAL's |const| list, in order to
emphasize this distinction.
@d mem_bot=0 {smallest index in the |mem| array dumped by \.{INITEX};
must not be less than |mem_min|}
@d mem_top==30000 {largest index in the |mem| array dumped by \.{INITEX};
must be substantially larger than |mem_bot|
and not greater than |mem_max|}
@y
@<Constants...@>=
@!mem_max=200000; {greatest index in \TeX's internal |mem| array;
must be strictly less than |max_halfword|;
must be equal to |mem_top| in \.{INITEX}, otherwise |>=mem_top|}
@!mem_min=0; {smallest index in \TeX's internal |mem| array;
must be |min_halfword| or more;
must be equal to |mem_bot| in \.{INITEX}, otherwise |<=mem_bot|}
@!buf_size=5000; {maximum number of characters simultaneously present in
current lines of open files and in control sequences between
\.{\\csname} and \.{\\endcsname}; must not exceed |max_halfword|}
@!error_line=72; {width of context lines on terminal error messages}
@!half_error_line=42; {width of first lines of contexts in terminal
error messages; should be between 30 and |error_line-15|}
@!max_print_line=79; {width of longest text lines output; should be at least 60}
@!stack_size=1000; {maximum number of simultaneous input sources}
@!max_in_open=6; {maximum number of input files and error insertions that
can be going on simultaneously}
@!font_max=75; {maximum internal font number; must not exceed |max_quarterword|
and must be at most |font_base+256|}
@!font_mem_size=20000; {number of words of |font_info| for all fonts}
@!param_size=60; {maximum number of simultaneous macro parameters}
@!nest_size=40; {maximum number of semantic levels simultaneously active}
@!max_strings=60000; {maximum number of strings; must not exceed |max_halfword|}
@!string_vacancies=300000; {the minimum number of characters that should be
available for the user's control sequences and font names,
after \TeX's own error messages are stored}
@!pool_size=350000; {maximum number of characters in strings, including all
error messages and help texts, and the names of all fonts and
control sequences; must exceed |string_vacancies| by the total
length of \TeX's own strings, which is currently about 23000}
@!save_size=600; {space for saving values outside of current group; must be
at most |max_halfword|}
@!trie_size=8000; {space for hyphenation patterns; should be larger for
\.{INITEX} than it is in production versions of \TeX}
@!trie_op_size=500; {space for ``opcodes'' in the hyphenation patterns}
@!dvi_buf_size=800; {size of the output buffer; must be a multiple of 8}
@!file_name_size=40; {file names shouldn't be longer than this}
@!pool_name='TeXformats:TEX.POOL ';
{string of length |file_name_size|; tells where the string pool appears}
@.TeXformats@>
@ Like the preceding parameters, the following quantities can be changed
at compile time to extend or reduce \TeX's capacity. But if they are changed,
it is necessary to rerun the initialization program \.{INITEX}
@.INITEX@>
to generate new tables for the production \TeX\ program.
One can't simply make helter-skelter changes to the following constants,
since certain rather complex initialization
numbers are computed from them. They are defined here using
\.{WEB} macros, instead of being put into \PASCAL's |const| list, in order to
emphasize this distinction.
@d mem_bot=0 {smallest index in the |mem| array dumped by \.{INITEX};
must not be less than |mem_min|}
@d mem_top==200000 {largest index in the |mem| array dumped by \.{INITEX};
must be substantially larger than |mem_bot|
and not greater than |mem_max|}
@z
@x
@d min_quarterword=0 {smallest allowable value in a |quarterword|}
@d max_quarterword=255 {largest allowable value in a |quarterword|}
@d min_halfword==0 {smallest allowable value in a |halfword|}
@d max_halfword==65535 {largest allowable value in a |halfword|}
@y
@d min_quarterword=0 {smallest allowable value in a |quarterword|}
@d max_quarterword=255 {largest allowable value in a |quarterword|}
@d min_halfword==0 {smallest allowable value in a |halfword|}
@d max_halfword==16777215 {largest allowable value in a |halfword|}
@z
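The comments in the change file state several constraints between these constants, which also explains why max_halfword is raised alongside mem_max. As a rough sketch, they can be checked mechanically before running tangle. The function name checkConstants is hypothetical, only a handful of the constraints are covered, and the default of 23000 for the length of TeX's own strings is taken from the comment on pool_size (the figure for eTeX will be somewhat larger).

```javascript
// Check a few of the constraints stated in the comments of tex.web.
// `texStrings` approximates the total length of TeX's own strings
// ("currently about 23000" according to the comment on pool_size).
function checkConstants(c, texStrings = 23000) {
  const problems = [];
  if (!(c.mem_max < c.max_halfword)) {
    problems.push("mem_max must be strictly less than max_halfword");
  }
  if (!(c.mem_top <= c.mem_max)) {
    problems.push("mem_top must not be greater than mem_max");
  }
  if (!(c.max_strings <= c.max_halfword)) {
    problems.push("max_strings must not exceed max_halfword");
  }
  if (!(c.pool_size - c.string_vacancies >= texStrings)) {
    problems.push("pool_size must exceed string_vacancies by the length of TeX's own strings");
  }
  if (c.dvi_buf_size % 8 !== 0) {
    problems.push("dvi_buf_size must be a multiple of 8");
  }
  return problems;
}

// Values from the replacement part of etex.sys.
const etexSys = {
  mem_max: 200000, mem_top: 200000, max_strings: 60000,
  string_vacancies: 300000, pool_size: 350000,
  dvi_buf_size: 800, max_halfword: 16777215,
};

console.log(checkConstants(etexSys)); // []
// With the original max_halfword, the enlarged mem_max is rejected.
console.log(checkConstants({ ...etexSys, max_halfword: 65535 }));
```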
Now you can run tangle:
tangle -underline etex.web etex.sys
You get the files etex.p and etex.pool.
Of course web2js will still look for tex.pool, but you can just change
filename = "tex.pool";
into
filename = "etex.pool";
in both header.js and library.js.
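If you prefer not to edit the two files by hand, the substitution can be scripted. This is a minimal sketch, assuming the assignment appears literally as filename = "tex.pool"; in both files; the helper names usePoolFile and patchFile are made up for illustration.

```javascript
const fs = require("fs");

// Replace the hard-coded pool file name in a piece of JavaScript source text.
function usePoolFile(source, poolName) {
  return source.split('filename = "tex.pool";').join(`filename = "${poolName}";`);
}

// Apply the replacement to a file on disk (not called here).
function patchFile(path, poolName) {
  fs.writeFileSync(path, usePoolFile(fs.readFileSync(path, "utf8"), poolName));
}

console.log(usePoolFile('filename = "tex.pool";', "etex.pool"));
// filename = "etex.pool";
```

Inside the web2js checkout, one would then call patchFile("header.js", "etex.pool") and patchFile("library.js", "etex.pool").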
Now let's try
node compile.js etex.p
Similar to your original experiment, we get
[...]
Need 41 of memory
Now 41 is significantly less than 32906, and in particular it is below 32767. So we can just allocate more memory. This needs to be done consistently in four files: in index.js, initex.js, tex.js and pascal/program.js, change
var pages = 20;
into
var pages = 50;
(Probably 41 would be enough, but 50 looks nicer.)
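Since the same value has to be changed consistently in four places, a small script avoids missing one of them. The following is a sketch under the assumption that each file contains the assignment in the form var pages = N; (with possibly different spacing); setPages is a hypothetical helper, not part of web2js.

```javascript
const fs = require("fs");

// Rewrite every `var pages = N;` assignment in a piece of JavaScript source.
// Tolerates varying whitespace around the equals sign.
function setPages(source, pages) {
  return source.replace(/var\s+pages\s*=\s*\d+\s*;/g, `var pages = ${pages};`);
}

// The four files named in the text above.
const files = ["index.js", "initex.js", "tex.js", "pascal/program.js"];

// Apply the change to all four files (not called here).
function patchAll(pages) {
  for (const f of files) {
    fs.writeFileSync(f, setPages(fs.readFileSync(f, "utf8"), pages));
  }
}

console.log(setPages("var pages = 20;", 50)); // var pages = 50;
```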
Now we can try
node compile.js etex.p
again. This time it actually works! You could run node initex.js now to get a plain TeX format, but we actually want eTeX. So get yourself copies of etex.src, etexdefs.lib and language.def, and change
library.setInput("\nplain \\dump\n\n"
in initex.js into
library.setInput("\n*etex \\dump\n\n"
Here, the asterisk * is important: it enables the "extended mode". Also change &plain into &etex in the same file to preload etex.
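The two strings passed to library.setInput follow a simple pattern, which the sketch below spells out. The helper names dumpInput and preloadInput are hypothetical, and the exact shape of the preload string ("\n&plain\n\n") is assumed from the stock initex.js.

```javascript
// Input fed to INITEX to read `<name>` and dump a format.
// With `extended` set, the name is prefixed by "*" (eTeX's extended mode).
function dumpInput(name, extended = false) {
  return `\n${extended ? "*" : ""}${name} \\dump\n\n`;
}

// Input that preloads a previously dumped format.
function preloadInput(format) {
  return `\n&${format}\n\n`;
}

console.log(JSON.stringify(dumpInput("etex", true))); // "\n*etex \\dump\n\n"
console.log(JSON.stringify(preloadInput("etex")));    // "\n&etex\n\n"
```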
Then
node initex.js
generates an e-TeX format etex.fmt and a memory dump, which can be used with
node tex.js
I managed to get a LaTeX format working with web2js, though with some caveats.
Here's a working (for me) sequence of steps.
Get web2js: either download the zip file and unzip, or run
git clone https://github.com/kisonecat/web2js.git
Get tex.web: download using your browser, or run:
wget http://mirrors.ctan.org/systems/knuth/dist/tex/tex.web
Get etex.ch: download using your browser, or run:
wget -O etex.ch 'https://tug.org/svn/texlive/trunk/Build/source/texk/web2c/etexdir/etex.ch?revision=32727&view=co'
Tie them together:
tie -m mytex.web tex.web etex.ch
Make the following modifications to the resulting file (or you can use the “proper” way involving etex.sys etc., as in the answer by Marcel Krüger):
@!mem_max=30000; {greatest index in \TeX's | @!mem_max=400000; {greatest index in \TeX'
@!stack_size=200; {maximum number of simul | @!stack_size=1000; {maximum number of simu
@!max_in_open=6; {maximum number of input | @!max_in_open=15; {maximum number of input
@!max_strings=3000; {maximum number of str | @!max_strings=60000; {maximum number of st
@!string_vacancies=8000; {the minimum numb | @!string_vacancies=300000; {the minimum nu
@!pool_size=32000; {maximum number of char | @!pool_size=350000; {maximum number of cha
@!trie_size=8000; {space for hyphenation p | @!trie_size=600000; {space for hyphenation
@!trie_op_size=500; {space for ``opcodes'' | @!trie_op_size=10000; {space for ``opcodes
@d mem_top==30000 {largest index in the |m | @d mem_top==400000 {largest index in the |
@d hash_size=2100 {maximum number of contr | @d hash_size=15000 {maximum number of cont
@d hyph_size=307 {another prime; the numbe | @d hyph_size=2003 {another prime; the numb
for i:=0 to @'37 do xchr[i]:=' '; | for i:=0 to @'37 do xchr[i]:=chr(i);
for i:=@'177 to @'377 do xchr[i]:=' '; | for i:=@'177 to @'377 do xchr[i]:=chr(i);
@d max_quarterword=255 {largest allowable | @d max_quarterword=65535 {largest allowabl
@d max_halfword==65535 {largest allowable | @d max_halfword==16777215 {largest allowab
These were determined mostly empirically, by bumping up the ones I got errors about. The change in the xchr assignments is as per the discussion at another question.
Correspondingly, edit the four files index.js, initex.js, pascal/program.js and tex.js to change
var pages = 20;
to
var pages=290;
(Actually, while playing with this I created a file commonMemory.js containing
module.exports = { commonPages: function() { return 290; } };
and used
var pages = require('./commonMemory').commonPages();
or the same with '..' in place of '.'. But that was just convenient while determining this number 290, and you don't have to do that.)
Edit library.js: inside function reset, change this block:
files.push({
  filename: filename,
  position: 0,
  descriptor: fs.openSync(filename,'r'),
});
to
let basename = filename.slice(filename.lastIndexOf('/') + 1);
const {spawnSync} = require('child_process');
let realFilename = spawnSync('kpsewhich', [filename]).stdout.toString().trim();
if (realFilename == '') {
  // try again with basename
  realFilename = spawnSync('kpsewhich', [basename]).stdout.toString().trim();
  if (realFilename == '') {
    // Give up, just create empty file
    spawnSync('touch', [basename]);
    realFilename = basename;
    console.log(`For filename #${filename}# created empty #${basename}#`);
  } else {
    console.log(`Found filename #${filename}# via basename at #${realFilename}#`);
  }
} else {
  console.log(`Found filename #${filename}# at #${realFilename}#`);
}
files.push({
  filename: filename,
  position: 0,
  descriptor: fs.openSync(realFilename,'r'),
});
The idea is that, because creating a LaTeX format file loads zillions of files, some of which aren't even distributed with TeX Live, we make the file lookup hook into kpsewhich to find all those files, and just use an empty file if one is not found. For what it's worth, these were the files that were not found and for which empty files were used: babel-latex.cfg, il2enc.dfu, omlenc.dfu, omxenc.dfu, uenc.dfu.
Edit initex.js to dump the LaTeX format instead of plain (and again when doing the core dump):
-library.setInput("\nplain \\dump\n\n",
+library.setInput("\n*latex.ltx \\dump\n\n",
and
-library.setInput("\n&plain\n\n",
+library.setInput("\n&latex\n\n",
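The kpsewhich fallback added to library.js above can also be written as a small standalone function, which makes the lookup order easier to see and to try out without a TeX installation. This is a sketch only: resolveTeXFile and its injectable lookup parameter are hypothetical, and unlike the code in the answer it does not create the empty file itself but reports that nothing was found.

```javascript
const { spawnSync } = require("child_process");

// Ask kpsewhich for a file; returns "" when nothing is found
// (or when kpsewhich itself cannot be started).
function kpsewhich(name) {
  const result = spawnSync("kpsewhich", [name]);
  return result.stdout ? result.stdout.toString().trim() : "";
}

// Try the full name first, then the basename. `found: "none"` means the
// caller has to fall back (the answer's code creates an empty file then).
function resolveTeXFile(filename, lookup = kpsewhich) {
  const basename = filename.slice(filename.lastIndexOf("/") + 1);
  let path = lookup(filename);
  if (path !== "") return { path, found: "name" };
  path = lookup(basename);
  if (path !== "") return { path, found: "basename" };
  return { path: basename, found: "none" };
}

// A fake lookup standing in for kpsewhich; the path is made up.
const fake = (n) => (n === "article.cls" ? "/texmf/article.cls" : "");
console.log(resolveTeXFile("tex/article.cls", fake).found); // basename
```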
Replace the contents of sample.tex with a LaTeX sample. For example, you can use (from here):
\documentclass{article}
\title{Cartesian closed categories and the price of eggs}
\author{Jane Doe}
\date{September 1994}
\begin{document}
\maketitle
Hello world!
\end{document}
Get web2js dependencies and build its Pascal parser:
npm install
npm run-script build
Build everything: from WEB (via TANGLE) to Pascal (via web2js) to WASM to loading and dumping format file and memory dump and then running TeX:
tangle -underline mytex.web && \
mv -f mytex.pool tex.pool && \
node compile.js mytex.p && \
node initex.js && \
node tex.js
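The same chain of commands can be driven from Node instead of the shell, for instance to reuse it with a different base name. A minimal sketch, with hypothetical helper names buildCommands and runAll; it assumes tangle, mv and node are on the PATH and that it is run from the web2js checkout.

```javascript
const { spawnSync } = require("child_process");

// The build steps from the text above, as argument vectors.
function buildCommands(base) {
  return [
    ["tangle", "-underline", `${base}.web`],
    ["mv", "-f", `${base}.pool`, "tex.pool"],
    ["node", "compile.js", `${base}.p`],
    ["node", "initex.js"],
    ["node", "tex.js"],
  ];
}

// Run the commands in order, stopping at the first failure (like `&&`).
function runAll(commands) {
  for (const [cmd, ...args] of commands) {
    const { status } = spawnSync(cmd, args, { stdio: "inherit" });
    if (status !== 0) return false;
  }
  return true;
}

// runAll(buildCommands("mytex")); // uncomment inside the web2js checkout
console.log(buildCommands("mytex").map((c) => c.join(" ")).join(" && "));
```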
Note that sample.dvi has been created successfully and looks ok. So we have a working LaTeX format. You can try editing sample.tex and re-running node tex.js to typeset various LaTeX documents (to DVI).
Caveats:
Because of those missing files that were substituted, it's possible that hyphenation patterns for non-English languages, or those particular font encodings, may not work correctly. But I could not find these files even in the TeX Live sources so I'm not sure what they're supposed to contain, or whether they're expected to be empty anyway.
The first revision of this answer has a way to build a LaTeX format without increasing max_quarterword/max_halfword, or increasing the number of memory pages granted on the JS side. That came at the cost of not loading most languages' hyphenation patterns, and also is not sufficient for loading heavy-weight packages like TikZ. The current revision does not have those issues.