pdfTeX is intended to offer complete compatibility with Knuth's TeX, and thus if the e-TeX extensions are not enabled should act in the same way.
XeTeX is based on the e-TeX code and does not set out to break any compatibility with Knuth's TeX unless it is absolutely necessary (i.e. there is no reimplementation of algorithms unless this relates to adding new features). However, there are places where differences occur. As noted in the question, XeTeX can load system fonts. When this is done, a new approach to placing boxes on the page is used. Possibly only those with deep involvement in that code can comment on whether it was absolutely necessary not to support the {}
approach to breaking ligatures, but as noted that doesn't always work anyway. Changes also occur where the classical TeX syntax is extended. For example, allowing more than two ^
is done to allow access to the full Unicode range. However, as shown in http://wiki.contextgarden.net/Encodings_and_Regimes that leads to code which does different things in 8-bit and Unicode TeX engines:
\def\"{0}\expandafter\def\csname^^^^^00022\endcsname{1}
\ifnum\"=0 \message{tex82}\else\message{newstuff}\fi
LuaTeX is a very different case. The designers have decided to revisit a number of Knuth's decisions: the LuaTeX manual covers the detail (there is quite a bit). For example, the fact that {}
does not inhibit a ligature is deliberate and relates to how LuaTeX processes the input and represents it in internal data structures. LuaTeX treats hyphenation as a property of language, not of font. As such, LuaTeX can hyphenate words using different fonts if the language does not change. As a result, hyphenation is governed by per-language primitives. (LuaTeX can also hyphenate the first word of a paragraph, which Knuth's TeX does not do and which is nowadays a 'feature'.) LuaTeX also has the same issues as XeTeX in terms of extending primitives for Unicode working: see the demo above for example.
Worth noting is that, as LuaTeX supports callbacks, functionality unchanged by the engine may be altered by Lua code. An obvious example is the \font
primitive. This is not extended by the engine, but is by a Lua-based font loader: plain and LaTeX users share the same code here while ConTeXt has its own (related) loader.
For both XeTeX and LuaTeX it is worth noting that the extension of math mode to allow a number of additional Unicode math parameters to be used (all prefixed \Umath...
) means that math mode spacing may change if these additional data points are available, principally when using a Unicode math font.
The bottom line from all of this is that if you have an 8-bit document written for pdfTeX, e-TeX or indeed TeX90 you should be able to use pdfTeX to process it unchanged. XeTeX will give the same result with almost all files of the same form, assuming they don't contain any engine tests or similar, and assuming that they contain no driver-specific code (XeTeX uses the xdvipdfmx driver in all cases, pdfTeX may use dvips, dvipdfmx or direct PDF output). LuaTeX may change the behaviour of such documents, including but not limited to hyphenation, line breaking, ligature formation and so on.
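An 'engine test' of the kind mentioned above usually checks for a primitive that only one engine provides. The following is a minimal sketch, not taken from the original discussion, and it assumes that nobody has defined \directlua or \XeTeXversion as a macro beforehand:

```tex
% Sketch of an engine test: \directlua exists only in LuaTeX,
% \XeTeXversion only in XeTeX (assuming neither name was redefined).
\ifx\directlua\undefined
  \ifx\XeTeXversion\undefined
    \message{8-bit engine: TeX90, e-TeX or pdfTeX}
  \else
    \message{XeTeX}
  \fi
\else
  \message{LuaTeX}
\fi
```

Any document containing a branch like this can legitimately give different results across engines, which is why such files are excluded from the compatibility statement.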
Looking at the question purely in terms of primitives, we have to decide if we are comparing XeTeX and LuaTeX with TeX90, e-TeX or pdfTeX 1.40. The question seems to be focussed on 'current' engines, so I will take pdfTeX 1.40 as the 'reference' (it incorporates the e-TeX modifications to TeX90 plus a range of additional primitives). As noted in the part above, some behaviours are changed in XeTeX and LuaTeX. I'll note where possible any TeX90/e-TeX/pdfTeX variations which seem important in this context. Quite a bit of this information is available in the LuaTeX manual.
As XeTeX and LuaTeX allow Unicode input, primitives which are followed by the <number>
of a character are affected by the change:
\char
\lccode
\uccode
\catcode
\sfcode
\efcode
(LuaTeX-only: see below)
\lpcode
\rpcode
\chardef
These all accept the full Unicode range (up to 0x10FFFF) with the newer engines: pdfTeX like e-TeX and TeX90 allows only the 8-bit range (maximum 0xFF).
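As a rough illustration of the difference in range, the sketch below stores the largest character code the running engine accepts. The name \maxcharcode is hypothetical, and the engine detection assumes that \XeTeXversion and \directlua have not been redefined:

```tex
% \maxcharcode is a made-up name used only for this illustration.
\ifx\XeTeXversion\undefined
  \ifx\directlua\undefined
    \chardef\maxcharcode="FF       % TeX90, e-TeX, pdfTeX: 8-bit range
  \else
    \chardef\maxcharcode="10FFFF   % LuaTeX: full Unicode range
  \fi
\else
  \chardef\maxcharcode="10FFFF     % XeTeX: full Unicode range
\fi
\message{largest character code: \number\maxcharcode}
```

The out-of-range assignments sit in skipped branches on an 8-bit engine, so the file itself can be read by all three engines.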
LuaTeX extends the range of registers allowed beyond that of e-TeX. Thus while pdfTeX and XeTeX allow up to 32767 box, count, dimen, muskip, marks and toks registers, LuaTeX allows a 16-bit range (max is 65535). This affects the primitives
\count
\dimen
\skip
\muskip
\marks
\toks
\countdef
\dimendef
\skipdef
\muskipdef
\toksdef
\box
\unhbox
\unvbox
\copy
\unhcopy
\unvcopy
\wd
\ht
\dp
\setbox
\vsplit
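A short sketch of what the larger register range means in practice. It assumes the e-TeX extensions are enabled in pdfTeX or XeTeX (Knuth's TeX stops at 255) and again uses \directlua as a hypothetical way to detect LuaTeX:

```tex
\ifx\directlua\undefined
  \count32767=1   % highest \count register in pdfTeX and XeTeX
\else
  \count65535=1   % highest \count register in LuaTeX
\fi
```

Code that allocates registers above 32767 is therefore tied to LuaTeX.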
The \font
primitive is extended by XeTeX to allow loading of system fonts with the syntax
\font⟨name⟩="⟨font identifier⟩⟨font options⟩:⟨font features⟩" ⟨TeX font options⟩
where the ⟨font identifier⟩
may be given in square brackets for a file name or without with a 'friendly' (system) name. This is not the case in LuaTeX: as noted above, LuaTeX is normally used with a Lua-based font loader which modifies the primitive via a callback.
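For illustration, the two forms of the XeTeX syntax might look as follows in a plain XeTeX file. The font file and system font named here are assumptions about what is installed, and the option and feature shown (/I, mapping=tex-text) are just examples:

```tex
% XeTeX only. Font names are assumptions about the local system.
\font\byfile="[lmroman10-regular.otf]" at 10pt                % file name in brackets
\font\byname="Latin Modern Roman/I:mapping=tex-text" at 12pt  % system name, option, feature
\byfile Loaded by file name. \byname Loaded by system name.
\bye
```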
LuaTeX allows file names to be given in braces as primitive syntax, for example
\input{file name}
This affects the primitives
\font
(note: this is purely to do with the file name of the font)
\input
\openin
\openout
pdfTeX adds a number of primitives to e-TeX, some related to PDF creation, some for microtypography and some general utilities. As XeTeX is based directly on e-TeX and not on pdfTeX, it only features some of these where they have been ported across. Some of the primitives are also renamed as they are not PDF-related. Thus XeTeX includes the following concepts introduced by pdfTeX:
\lpcode
\rpcode
\pdfpageheight
\pdfpagewidth
\pdfsavepos
\pdflastxpos
\pdflastypos
\ifincsname
\ifprimitive
(\ifpdfprimitive
in pdfTeX)
\primitive
(\pdfprimitive
in pdfTeX)
\strcmp
(\pdfstrcmp
in pdfTeX)
\shellescape
(\pdfshellescape
in pdfTeX)
\normaldeviate
(TL'19 onward, \pdfnormaldeviate
in pdfTeX)
\uniformdeviate
(TL'19 onward, \pdfuniformdeviate
in pdfTeX)
\randomseed
(TL'19 onward, \pdfrandomseed
in pdfTeX)
\setrandomseed
(TL'19 onward, \pdfsetrandomseed
in pdfTeX)
\elapsedtime
(TL'19 onward, \pdfelapsedtime
in pdfTeX)
\resettimer
(TL'19 onward, \pdfresettimer
in pdfTeX)
\filedump
(TL'19 onward, \pdffiledump
in pdfTeX)
\filemoddate
(TL'19 onward, \pdffilemoddate
in pdfTeX)
\filesize
(TL'19 onward, \pdffilesize
in pdfTeX)
\mdfivesum
(TL'19 onward, \pdfmdfivesum
in pdfTeX)
but not for example \efcode
(as noted above), \pdfliteral
or many others.
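Because of the renaming, code that wants to use one of these utilities under a single name needs a small compatibility layer. A minimal sketch for the string comparison primitive, assuming neither name has been defined by a macro package already:

```tex
% Make \pdfstrcmp available under its pdfTeX name where possible.
\ifx\pdfstrcmp\undefined
  \ifx\strcmp\undefined
    \message{no string comparison primitive in this engine}
  \else
    \let\pdfstrcmp\strcmp   % XeTeX provides it as \strcmp
  \fi
\fi
```

On pdfTeX the outer test is false and nothing happens; on an engine with neither primitive, only the message is issued.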
LuaTeX is based on pdfTeX and retains some of the primitives introduced there, renames some to remove 'pdf' and drops others. As well as primitives marked as experimental or deprecated in pdfTeX 1.40, LuaTeX also removes the primitives:
\pdfelapsedtime
\pdfescapehex
\pdfescapename
\pdfescapestring
\pdffiledump
\pdffilemoddate
\pdffilesize
\pdflastmatch
\pdfmatch
\pdfmdfivesum
\pdfresettimer
\pdfshellescape
\pdfstrcmp
\pdfunescapehex
and provides
\primitive
\ifprimitive
\ifabsnum
\ifabsdim
without 'pdf' in the name. It also moves all of the 'back end' concepts (to do with producing PDF output) to three new primitives which implement the functionality of the various PDF-related \pdf...
primitives from pdfTeX.
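The text does not name the three primitives; to my knowledge they are \pdfextension, \pdfvariable and \pdffeedback in LuaTeX 0.85 and later. A hedged sketch of how some familiar pdfTeX settings map onto them (the values are arbitrary):

```tex
% LuaTeX 0.85 or later only; values are arbitrary examples.
\outputmode=1                  % pdfTeX: \pdfoutput=1
\pdfvariable compresslevel=0   % pdfTeX: \pdfcompresslevel=0
\pdfextension literal{0 g}     % pdfTeX: \pdfliteral{0 g}
\message{last object: \the\numexpr\pdffeedback lastobj\relax}
                               % pdfTeX: \the\pdflastobj
```

Macro packages usually hide this behind compatibility definitions, so documents rarely need to use the new names directly.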
Currently, XeTeX and pdfTeX use the 'TeX--XeT' model for right-to-left typesetting while LuaTeX uses one derived from Omega/Aleph. As such, it does not feature the primitives
\TeXXeTstate
\beginR
\beginL
\endR
\endL
(Note that there has been suggestion that XeTeX may at some stage move from TeX--XeT to the Omega model.)
LuaTeX also alters the behaviour of \endlinechar
and \newlinechar
: the maximum value is 127 while setting any value below zero stores -1.
Both XeTeX and LuaTeX add new primitives to TeX and the behaviour of these of course requires the appropriate engine. Note in particular that new primitives for Unicode math handling (\Umath...
) are available in both engines. They also both feature \suppressfontnotfounderror
.
The total changes made by Knuth for the 2021 tuneup were huge (there
were several people scrutinizing his work this time around). There were
several small changes from typo corrections in older errata to rewording
of the copying conditions in tex.web
and even some changes to the
plain format itself. Here are the entries added to errorlog.tex
and a
small[1] explanation of each.
(The “five bugs” Knuth mentions are I948
,
S949
, S950
, B952
, and R953
. There are two others that were not
added to errorlog.tex
; they are at the bottom of this answer.)
A shorter version of this summary is also now available at the TUG website.
* 15 January 2021
I948. Don't pause on errors when tracing paragraphs (Udo Wermuth). @826
S949. Don't try to interact when in |\batchmode| (Xiaosa Zhang). @83
S950. Don't try to edit when no file is active (Xiaosa Zhang). @84
R951. Take date and time sometimes from system, not user (Udo Wermuth). @241,536
B952. Don't allow implicit left brace after |#| (Udo Wermuth). @476
R953. After nine parameters, must delete offending tokens (Bruno Le Floch). @476
D954. Garbage visible in buffer after file ends prematurely (DRF). @486
R955. Force nonexistent characters to have null specs (DRF). @722
C956. Don't mark fraction noads as temporarily Inner (DRF). @761
Q957. Reset |\newlinechar| before logging the stats (Udo Wermuth). @1333,1335
[1]: I wrote the “small explanation” in the intro before actually
writing the explanations, so I didn't lie... technically :)
I948. Don't pause on errors when tracing paragraphs (Udo Wermuth). @826
This bug would cause TeX to apparently hang when \tracingparagraphs
was on (> 0), and the Infinite glue shrinkage
error occurred
while the paragraph trace
info was being printed. When \tracingparagraphs
is on, TeX is
writing to log_only
(unless \tracingonline=1
), and in case of
that error it would prompt the user for interaction, but since the write
selector was redirecting the output to the .log
, the user would
not see that, and TeX would be apparently stuck.
For example, if you run on a file that contains this line:
\tracingparagraphs=1 Press\hss return.\end
the terminal will show
This is TeX, Version 3.14159265 (TeX Live 2020) (preloaded format=tex)
restricted \write18 enabled.
entering extended mode
(./test.tex
and hang there, waiting for user interaction (for example, typing x
then <RETURN>
to end the run, or just <RETURN>
to ignore the
offending \hss
). The hang will not happen if you run
$ tex '\tracingparagraphs=1 Press\hss return.\end'
because in that case there is no .log
open yet, so the output will be
to the terminal.
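Since the prompt is only hidden while output goes to the log alone, a pre-2021 TeX should keep it visible when the trace is also sent to the terminal. A variant of the test file along those lines (a sketch based on the remark about \tracingonline, not part of the original report):

```tex
% With \tracingonline=1 the paragraph trace goes to terminal and log,
% so the error message and the ? prompt are not hidden.
\tracingonline=1 \tracingparagraphs=1 Press\hss return.\end
```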
After the tuneup, TeX will treat this error as it treats any other
error: if in \errorstopmode
it asks the user for interaction, otherwise
it will scroll past it:
This is TeX, Version 3.141592653 (TeX Live 2021/dev) (preloaded format=tex)
(./test.tex
! Infinite glue shrinkage found in a paragraph.
<inserted text> \par
<to be read again>
\end
l.1 \tracingparagraphs=1 Press\hss return.\end
?
The relevant change entry is:
429. Don't echo error message to terminal when tracing paragraphs
(Udo Wermuth, 15 January 2017)
@x module 826
begin no_shrink_error_yet:=false;
@y
begin no_shrink_error_yet:=false;
@!stat if tracing_paragraphs>0 then end_diagnostic(true);@+tats@;
@z
@x
error;
@y
error;
@!stat if tracing_paragraphs>0 then begin_diagnostic;@+tats@;
@z
S949. Don't try to interact when in |\batchmode| (Xiaosa Zhang). @83
This bug, initially
reported here, was
causing TeX to ask for user interaction (TeX's ?
prompt) while in
\batchmode
, thus trying to write to a closed \write
stream,
and this would cause a segmentation fault. From Karl's
answer to the original report, you can reproduce that error in TeX up to
2020 by running tex -ini
then typing these lines, ending them with
<RETURN>
:
\catcode`\^=7 \catcode`\^^?=15 \s^^?E
1
q
v
With the new TeX, after typing q
to enter \batchmode
, TeX won't try
to ask for user interaction if it can't, so that won't break anymore.
The relevant change entry is:
430. Defeat interactions during batch mode (Xiaosa Zhang, 27 June 2020)
@x module 83
@ @<Get user's advice...@>=
loop@+begin continue: clear_for_error_prompt; prompt_input("? ");
@y
@ @<Get user's advice...@>=
loop@+begin continue: if interaction<>error_stop_mode then return;
clear_for_error_prompt; prompt_input("? ");
@z
S950. Don't try to edit when no file is active (Xiaosa Zhang). @84
This bug, initially
reported here, was
triggered when you tried to open the editor (using TeX's E
option)
when an error happened in an input given interactively. Suppose you
have a file called h.tex
with a single line (supposing that \ERROR
is undefined, or is anything that would cause an error):
% h.tex
\ERROR
then, when TeX complains about ! Undefined control sequence \ERROR
you
reply:
I\MISTAKE V
which will insert \MISTAKE V
for TeX to process, and it will once
again complain, due to the undefined \MISTAKE
, and now you reply:
E
and TeX will segfault.
Here's the transcript of the interactive session:
$ tex h
This is TeX, Version 3.14159265 (TeX Live 2020) (preloaded format=tex)
(./h.tex
! Undefined control sequence.
l.1 \ERROR
? I\MISTAKE V
! Undefined control sequence.
<insert> \MISTAKE
V
l.1 \ERROR
? E
No pages of output.
Transcript written on h.log.
Segmentation fault (core dumped)
This error would happen because TeX would try to tell you the name of
the input file in which the error occurred, but since the error was on
an interactively input command, there is no associated file. After the
tuneup TeX knows that in that case it is not reading from a file, so it
won't try to give you a file name.
The relevant change entry is:
431. Don't exit to editor if no input file is at the bottom line
(Xiaosa Zhang, 03 July 2020)
@x module 84
"E": if base_ptr>0 then
@y
"E": if base_ptr>0 then if input_stack[base_ptr].name_field>=256 then
@z
@x module 85
if base_ptr>0 then print("E to edit your file,");
@y
if base_ptr>0 then if input_stack[base_ptr].name_field>=256 then
print("E to edit your file,");
@z
R951. Take date and time sometimes from system, not user (Udo Wermuth). @241,536
Before setting \jobname
(more precisely before starting the
.log
file) you could change the value of \year
, \month
, \day
and \time
,
and that would be written in the header line of the .log
. If you did
$ tex '\day=99 \end'
the first line of the .log
would say something like
This is TeX, Version 3.14159265 (TeX Live 2020) (INITEX) 99 FEB 2021 22:18
(note the Feb 99th :) or, if you were feeling really
devious, you could print any three bytes from TeX's executable by
setting a bogus value of \month
, like (with the build I have here)
$ tex -ini "\month=-54 \end"
to get a month called TeX:
This is TeX, Version 3.14159265 (TeX Live 2020) (INITEX) 2 TeX 2021 22:20
or with an extreme enough value you could make TeX crash with a segmentation
fault, for example:
$ tex '\month=-100000 \end'
With the new version, the value printed in the header is an internal
sys_(time|day|month|year)
, that can't be changed by changing the
primitive registers.
This is a rather long (in number of lines) change, and not so
interesting as reading material here (basically declare new variables
sys_<thing>
, initialise the primitives to those, and use sys_<thing>
instead of <thing>
to print the banner), so I will omit the change
entry, but you can find it by searching for its header
432. Keep date and time in system variables, use them in opening banner
(Udo Wermuth, 11 December 2020)
in tex82.bug
.
B952. Don't allow implicit left brace after |#| (Udo Wermuth). @476
This bug (which, surprisingly, had not been found before) allowed you,
when the last token of the <parameter text>
of a definition was
#
(with category code 6), to use an implicit begin-group character
(like \bgroup
) in place of the explicit begin-group character
that marks the end of the <parameter text>
of a definition, such that
\def\foo#1#\bgroup(#1)}
\show\foo
was valid, and would show
> \foo=macro:
#1\bgroup ->(#1)\bgroup .
on the terminal, meaning that the parameter #1
of \foo
was delimited
by \bgroup
, and that \bgroup
would be reinserted after the
<replacement text>
of the macro, exactly how TeX does with an
explicit begin-group character. After the tuneup, you will get an
error from the definition above:
! Parameters must be numbered consecutively.
<to be read again>
\bgroup
l.1 \def\foo#1#\bgroup
(#1)}
?
and further errors that will follow due to the malformed definition
(the macro that will be defined with the input above will be, after some
errors, \foo=macro:#1#2\bgroup (#31)->.
).
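For comparison, this is the explicit-brace form that was always valid and remains so after the tuneup. It is a small sketch assuming plain TeX category codes:

```tex
\def\foo#1#{(#1)}
\show\foo        % displays: macro:#1{->(#1){
\foo abc{def}    % #1 is abc, and the expansion is (abc){def}
\end
```

Here #1 is delimited by the explicit left brace, which is not absorbed as part of the argument and is put back after the replacement text.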
The relevant change entry is:
434. Don't accept an implicit left brace after # in macro head
(Udo Wermuth, 20 May 2020)
@x module 476
if cur_cmd=left_brace then
@y
if cur_tok<left_brace_limit then
@z
R953. After nine parameters, must delete offending tokens (Bruno Le Floch). @476
With this bug you could have TeX do some real funny things. When
scanning the <parameter text>
of a macro, after the nine allowed
parameters, any #
will raise an error, but the token following that
#
would be left in the <parameter text>
. Suppose you had a macro
with 9 parameters, and tried to add a tenth parameter #0
:
\def\foo#1#2#3#4#5#6#7#8#9#0{}
\show\foo
TeX would complain to you that
! You already have nine parameters.
l.1 \def\foo#1#2#3#4#5#6#7#8#9#0
{}
? h
I'm going to ignore the # sign you just used.
?
and the macro definition would have the #
ignored, but the 0
would
remain there:
> \foo=macro:
#1#2#3#4#5#6#7#8#90->.
l.2 \show\foo
?
So far, nothing exciting. But now suppose one day you were feeling extra
naughty and used ##
instead of #0
, like
\def\foo#1#2#3#4#5#6#7#8#9##{}
, then \show\foo
would say
> \foo=macro:
#1#2#3#4#5#6#7#8#9##->.
l.2 \show\foo
?
and would you look at that! #9
is now delimited by a parameter token,
so if you called \foo 12345678hello#
, #9
would be hello
!
Even worse, you could trick TeX's scanner into grabbing a }
as the
argument of a macro without errors (after the two
You already have nine parameters
errors, of course). This example
from the original bug report shows that:
\def\foo#1#2#3#4#5#6#7#8#9#}##{\show#9}
\show\foo
\foo12345678} }#
\end % ^^ delimiter
In the example you have a macro delimited by }#
(the tokens left after
TeX removed the two extra #
), and as part of the scanning, the first
}
would not go through the Argument of \foo has an extra }
error,
so it would be added to the current parameter, then the \show#9
will
say:
> end-group character }.
<argument> }
\foo #1#2#3#4#5#6#7#8#9}##->\show #9
l.39 \foo12345678} }#
?
After the 2021 tuneup, TeX now understands that you meant for #0
to be
a parameter, so naturally the 0
should be removed as well, so now the
error message says a little more:
! You already have nine parameters.
l.1 \def\foo#1#2#3#4#5#6#7#8#9(#0
){}
? h
I'm going to ignore the # sign you just used,
as well as the token that followed it.
?
and the definition will contain no trace of your tenth parameter:
> \foo=macro:
#1#2#3#4#5#6#7#8#9()->.
l.3 \show\foo
?
The relevant change entry is:
433. After nine parameters, delete both # and the token that follows
(Bruno Le Floch, 22 October 2020)
@x module 473
label found,done,done1,done2;
@y
label found,continue,done,done1,done2;
@z
@x module 474
begin loop begin get_token; {set |cur_cmd|, |cur_chr|, |cur_tok|}
@y
begin loop begin continue: get_token; {set |cur_cmd|, |cur_chr|, |cur_tok|}
@z
@x module 476
help1("I'm going to ignore the # sign you just used."); error;
@y
help2("I'm going to ignore the # sign you just used,")@/
("as well as the token that followed it."); error; goto continue;
@z
D954. Garbage visible in buffer after file ends prematurely (DRF). @486
With this bug, the error message File ended within \read
could be
followed by garbage context, if the circumstances were right. Before
the tuneup, if you were \read
ing from a file with one {
too many,
you could see the error message. Suppose a file unbal.tex
with the
single line:
{
and then you run the following document:
\catcode`{=1 \catcode`}=2 \catcode`#=6
\openin1 unbal
\def\A#1#2#3#4#5#6#7#8#9{\read1to \x}
\def\B#1#2#3#4#5#6#7#8#9{\A#1#2#3#4#5#6#7#8#9 \relax}
\def\C#1#2#3#4#5#6#7#8#9{\B#1#2#3#4#5#6#7#8#9 \relax}
\def\D#1#2#3#4#5#6#7#8#9{\C#1#2#3#4#5#6#7#8#9 \relax}
\def\E#1#2#3#4#5#6#7#8#9{\D#1#2#3#4#5#6#7#8#9 \relax}
\E123456789 \end
the error message would start with
Runaway definition?
->{
! File ended within \read.
<read 1> {^^M7#8#9{\D
where ^^M7#8#9{\D
are the leftovers in the buffer
variable.
After the tuneup, TeX now cleans up the buffer
and the error context
is correct:
Runaway definition?
->{
! File ended within \read.
<read 1>
The relevant change entry is:
435. Keep garbage out of the buffer if a |\read| end unexpectedly
(DRF, 17 February 2018)
@x module 486
align_state:=1000000; error;
@y
align_state:=1000000; limit:=0; error;
@z
R955. Force nonexistent characters to have null specs (DRF). @722
This one didn't have a visible effect on normal usage of TeX, so no
compilable example for this one (mostly because I failed to produce a
bad .tfm
file to trigger this bug).
In a .tfm
file, a non-existent character is marked by its width index
being zero, and TeX assumed that in that case all other metrics of
said character were zero as well, but nothing enforced that.
When reading a character from a font, TeX would check only the width
index and take it on trust that everything else was zero too.
If a .tfm
was made so that the width index was zero
but, for example, the italic correction index was not, that index would not be
zeroed and the wrong italic correction would be used.
After the tuneup, if the width index of a character is zero, TeX will nullify
the entire character to make sure. The relevant change entry is:
436. Zero out nonexistent chars, to prevent rogue TFM files
(DRF, 06 October 2020)
@x module 722
math_type(a):=empty;
@y
math_type(a):=empty; cur_i:=null_character;
@z
C956. Don't mark fraction noads as temporarily Inner (DRF). @761
This bug had a fix in tex.web
but it was more of a bug in The TeXbook.
In short, some places in The TeXbook, for example the last paragraph on
page 155 used to say:
There’s also an eighth classification, \mathinner
, which is not
normally used for individual symbols; fractions and \left...\right
constructions are treated as “inner” subformulas [...]
but now fractions were removed from that statement. The relevant change
entry in tex.web
is:
437. Don't classify fraction noads as inner noads (DRF, 25 March 2019)
@x module 761
fraction_noad: begin t:=inner_noad; s:=fraction_noad_size;
end;
@y
fraction_noad: s:=fraction_noad_size;
@z
which now doesn't make a fraction an Inner atom any longer. Though that
won't have any change in math typesetting because a fraction was usually
written as {1\over2}
, and the extra braces to enclose the subformula
would make that fraction an Ord atom for all purposes.
The only way to get the fraction as an actual Inner atom was if either
the formula was only a fraction, like $1\over2$
, in which case it
wouldn't make a difference, because of the math boundaries, or if the
fraction was enclosed in a \left...\right
pair, but then it would
become an Inner atom anyway because of \left...\right
. All other uses
of a fraction would result in an Ord atom due to the braces that delimit
the subformula, so this classification was dropped altogether to avoid
confusion.
Proof of that can be seen from this example:
\nopagenumbers \loggingall \tracingonline=1
Punct: ${1\over2}.$\par
Ord: ${1\over2}x$
\end
If the fractions were actually an Inner atom, according to the math
spacing table on page 170 of The TeXbook, you should have a
\thinmuskip
after the fractions in both cases (followed by a Punct and
followed by an Ord), but if you look at the produced lists, you see
that neither have the space:
.......\sevenrm 2
.....\hbox(0.0+0.0)x1.2, shifted -2.5
...\teni :
and
.......\sevenrm 2
.....\hbox(0.0+0.0)x1.2, shifted -2.5
...\teni x
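As a contrast, forcing the fraction to be an Inner atom with \mathinner should make the thin space appear. This variant is my own sketch, not part of the original bug discussion:

```tex
\nopagenumbers \loggingall \tracingonline=1
Inner then Ord: $\mathinner{1\over2}x$
\end
```

According to the same spacing table, the traced list for this formula should contain a \glue(\thinmuskip) item between the fraction and the x, which is what is missing in the two lists above.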
Q957. Reset |\newlinechar| before logging the stats (Udo Wermuth). @1333,1335
Udo Wermuth reported that you could get some weird terminal output from
TeX depending on how you set the \newlinechar
parameter. For example
running
$ tex '\newlinechar=32 \end'
would print
$ tex '\newlinechar=32 \end'
This is TeX, Version 3.14159265 (TeX Live 2020) (preloaded format=tex)
No
pages
of
output.
Transcript
written
on
texput.log.
because it would make TeX use the space character (ASCII 32) as a
newline character when writing to a file, so all spaces would be
converted. Equally interesting outputs could be achieved by using
different ASCII codes. This is now corrected, and the command above
will generate the much more boring
$ tex '\newlinechar=32 \end'
This is TeX, Version 3.141592653 (TeX Live 2021/dev) (INITEX)
No pages of output.
Transcript written on texput.log.
The relevant change entry is:
440. Normalize newlinechar when printing the final stats
(Udo Wermuth, 29 November 2020)
@x module 1333
begin @<Finish the extensions@>;
@y
begin @<Finish the extensions@>; new_line_char:=-1;
@z
@x module 1335
begin c:=cur_chr;
@y
begin c:=cur_chr; if c<>1 then new_line_char:=-1;
@z
Missing (\tabskip)
indication in Underfull message
Another bug, found by Igor Liferenko
was a missing \tabskip
glue indication in the Underfull box message of
an alignment, that was otherwise documented in The TeXbook. In this
example:
% \catcode`\{=1 \catcode`\}=2 \catcode`\&=4 \catcode`\#=6
\showboxdepth=1 \tracingonline=1
\tabskip=0pt plus10pt \halign to200pt{&#\hfil\cr
\hbox to50pt{}&\hbox to60pt{}\cr}
\end
the terminal would show
\hbox(0.0+0.0)x200.0, glue set 3.0
.\glue(\tabskip) 0.0 plus 10.0
.\unsetbox(0.0+0.0)x50.0
.\glue(\tabskip) 0.0 plus 10.0
.\unsetbox(0.0+0.0)x60.0
.\glue 0.0 plus 10.0
whereas the last line should be
.\glue(\tabskip) 0.0 plus 10.0
The final (\tabskip)
indication now shows properly, as documented.
The relevant change entry is:
438. Properly identify tabskip glue when tracing repeated templates
(Igor Liferenko, 10 January 2020)
@x module 793
link(p):=new_glue(glue_ptr(cur_loop));
@y
link(p):=new_glue(glue_ptr(cur_loop));
subtype(link(p)):=tab_skip_code+1;
@z
Internal variable overflow in \hyphenation
This is probably the least exciting bug found, mainly because with your
everyday TeX you can't spot it. It was found by David Fuchs with a
special version of TeX that he crafted to check memory boundary
violations.
When declaring a \hyphenation
, the variable hn
, declared as a
small_number
(within the range 0..63
) would be maxed out, but then
in module §930, <Look for the word |hc[1..hn]|...>
, that looks for a
given word in TeX's exception table, would call incr(hn)
, making it
larger than the declared size. This document does that:
\lefthyphenmin=0
\righthyphenmin=0
\hyphenation{-a-b-c-d-e-f-g-h-i-j-k-l-m-n-o-p-q-r-s-t-u-v-w-x-y-z-a-b-%
c-d-e-f-g-h-i-j-k-l-m-n-o-p-q-r-s-t-u-v-w-x-y-z-a-b-c-d-e-f-g-h-i-j-%
k-l-m-n-o-p-q-r-s-t-u-v-w-x-y-z}
\showhyphens{abcdefghijklmnopqrstuvwxyzabcdefg%
hijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz}
\end
but it's not noticeable when running TeX because the variable overflow
doesn't appear. What exactly happens is a bit implementation dependent,
as it relies on what the compiler translates a small_number
to. If it
becomes a variable that holds more than 0..63
, nothing bad will
happen.
The relevant change entry is:
439. Use the correct range for local variable hn (DRF, 31 October 2020)
@x module 892
@!hn:small_number; {the number of positions occupied in |hc|}
@y
@!hn:0..64; {the number of positions occupied in |hc|;
not always a |small_number|}
@z
Conclusion
Of course, the one change (and very likely the only one) people will see
is:
-@d banner=='This is TeX, Version 3.14159265' {printed when \TeX\ starts}
+@d banner=='This is TeX, Version 3.141592653' {printed when \TeX\ starts}
:-)
As usual, there were several other changes to all of Knuth's
distribution including, but not limited to, Metafont (some changes that
were ported from TeX due to similar parts of the code), changes to The
TeX and Metafont books, and changes to the plain format (everything
really minimal and unlikely to bite any reasonable user document).
There is also a TUGboat article by Don describing the major
changes to TeX and Metafont, available in the TUGboat web page.
Disclaimer
Most of the code examples in this answer are not my own, but taken from
the original bug reports (some slightly modified), so thanks to the
authors of those bug reports, and thanks also to the dedicated people
that searched for bugs, and to those who read each and every one of the
(probably thousands of) bug reports in the last 7 years. Let's prepare
for another 8 of those!
Best Answer
My dark guesses, freely adapted from Sybill Trelawney's divination lessons:
2013
2020
2028
2037
2047 etc.