Question about exercise 11.5 in TeXbook
For readers who can't find the line
\def\\{\if\space\next\ % assume that \next is unexpandable
in their copy of the TeXbook: it appears in the errata published by Donald E. Knuth (“page A311”).
Suppose you do:
\def\foobar{cat}
\noindent
\demobox{The \foobar\ in the hat}
After \next has grabbed the \foobar token, \\ will expand to something equivalent to:
\if\space\next\ %
\else \setbox0=\hbox{\next}\maketypebox\fi
with \next being \let-equal to \foobar. According to the documentation of \if (TeXbook p. 209), TeX expands the tokens following the \if until it finds two non-expandable ones. \space expands to an explicit space token in one step, so TeX goes on with \next (which has the same meaning as \foobar at this point), because it needs one more non-expandable token. After \next has been expanded, the input is equivalent to:
\if〈space token〉cat\ %
\else \setbox0=\hbox{\next}\maketypebox\fi
where 〈space token〉 represents an explicit space token (one could define a control sequence that is \let-equal to an explicit space token and use it instead of 〈space token〉, see footnote 1 below). Now TeX has two non-expandable tokens following the \if: a space token and a c character token (of category 11 under the normal catcode regime). So the outcome of the \if can be decided: it is false, because the character codes of a 〈space token〉 and of c differ, and TeX will skip to the \else clause.
There is no big problem so far, though we are going to box the whole cat at once instead of each character separately (c, a, and t); but let's back up a little bit. Had we used:
\def\foobar{ cat}
the input would have been equivalent to:
\if〈space token〉〈space token〉cat\ %
\else \setbox0=\hbox{\next}\maketypebox\fi
The test would have been true and TeX would have left cat\ in the input stream, which is plain wrong, because we were supposed to test what we just grabbed in \next, not to insert new text!
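Both cases can be checked outside of \demobox with a small sketch. It assumes plain TeX and reports the outcome on the terminal with \message; the helper \probe and the markers T and F are made up for this illustration and are not part of Knuth's macros.

```tex
% Plain TeX sketch: what \if\space\next does when \next is a macro.
% \probe is a hypothetical helper; T and F are arbitrary markers.
\def\probe{\if\space\next T\else F\fi}

\def\foobar{cat}
\let\next=\foobar
\message{[\probe]}% space vs. c is false: writes [F]

\def\foobar{ cat}
\let\next=\foobar
\message{[\probe]}% space vs. space is true: writes [catT]
\bye
```

In the second message, the letters cat appear before the T: they are the leftover tokens from the expansion of \next, which is exactly the unwanted text insertion described above.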
So, the comment “assume that \next is unexpandable” could be rephrased more generally, in my humble opinion, as “assume that \next ultimately expands to either (1) exactly one character token or (2) exactly one \chardef token or (3) a control sequence token that is \let-equal to (1) or (2)” (see footnote 2 below; the character token in (1) is necessarily non-active, because of the “ultimately”). Indeed, you can test that \demobox works perfectly when \next recursively expands to a single character token, as in:
\def\myspacei{\myspace}
\def\myspace{\space}
\noindent
\demobox{Abc def\myspacei pU gHi}
Using \myspacei here gives the same result as using an explicit space token, because it recursively expands to such a token.
Here is another example that additionally uses a control sequence that recursively expands to a non-space character token:
\def\myspacei{\myspace}
\def\myspace{\space}
\def\myxii{\myxi}
\def\myxi{\myx}
\def\myx{X}
\noindent
\demobox{Abc def\myspacei pU\myxii gHi}
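The same kind of check can be made without typesetting anything. The following is only a sketch under plain TeX: \probe and \mychar are hypothetical names introduced here, and T and F are arbitrary markers written to the terminal by \message.

```tex
% Plain TeX sketch: \next ultimately yields exactly one token,
% so the \if test leaves nothing behind in the input stream.
\def\probe{\if\space\next T\else F\fi}% hypothetical helper
\def\myspacei{\myspace} \def\myspace{\space}
\def\myxii{\myxi} \def\myxi{\myx} \def\myx{X}
\chardef\mychar=`\X % hypothetical \chardef token

\let\next=\myspacei \message{[\probe]}% ends up as a space: [T]
\let\next=\myxii    \message{[\probe]}% ends up as X: [F]
\let\next=\mychar   \message{[\probe]}% unexpandable, not a space: [F]
\bye
```

In all three cases the brackets contain only the marker, with no stray characters, in contrast to what happens when \next expands to several tokens.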
Your proposal:
\def\\{\expandafter\ifx\space\next\ %
...
would also work, as long as \next has been \let-equal to a space token (explicit or implicit). But it wouldn't work with input containing spaces in the form of macros like \space or our \myspacei macro defined above. Indeed, \ifx distinguishes between character tokens and macros (see the specification of \ifx on p. 210 of the TeXbook).
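The difference can be observed with a short sketch. It assumes plain TeX, reuses the \stoken construction from footnote 1, and writes an arbitrary marker (T or F) to the terminal with \message.

```tex
% Plain TeX sketch: \ifx compares meanings and never expands \next.
\def\\{\let\stoken= }\\ % \stoken is now an implicit space token

\let\next=\stoken
\message{[\expandafter\ifx\space\next T\else F\fi]}% space token: [T]

\let\next=\space
\message{[\expandafter\ifx\space\next T\else F\fi]}% macro \space: [F]
\bye
```

In the second test \next is a macro whose expansion is a space, but \ifx compares the macro itself with the space token, so the test fails.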
Finally, although it would work, your replacement of \endlist with \end does not sound like the best coding style to me, because \end is an existing TeX primitive; Knuth chose something more “unique” to mark the end of the text to be worked on. Besides, the name \endlist was visibly chosen to match \dolist: it is a matter of consistency. See in particular:
\def\demobox#1{\setbox0=\hbox{\dolist#1\endlist}%
...
Footnotes
1. You can define a control sequence \stoken that is \let-equal to an explicit space token like this:
{\def\\{\global\let\stoken= }\\ }% now, \stoken is an implicit space token
(adapted from the TeXbook p. 376). Two other ways are given in the TeXbook p. 336 (exercise 24.6):
\def\\{\let\stoken= }\\ %
and
\def\\#1\\{}\futurelet\stoken\\ \\%
2. This is in particular the case when \next has been \let-equal to a non-active character token, which is, I think, the case Knuth had in mind when he used the word “unexpandable” (indeed, a non-active character token, or a control sequence that has been \let-equal to such a token, never expands). In other words, the condition “\next is unexpandable” from the comment you quoted is a sufficient condition for ensuring that the macros behave sanely, and is only a particular case of the more general condition I gave. :-)
It means what it says: only unexpandable tokens are allowed in the argument to \demobox. More precisely, character tokens or unexpandable control sequences that correspond (via \chardef or \let) to printable characters (including spaces).
If you try
\demobox{abc def}
\def\expandable{expandable token}
\demobox{\expandable}
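you get output in which the whole expansion of \expandable is boxed as a single unit rather than character by character, which is probably not what you were expecting. On the other hand, a definition like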
\def\expandable{ expandable token}
would yield something very far from the expectations.
So the \demobox macro can be used only when its argument consists of unexpandable tokens or macros expanding to a single unexpandable token. \chardef tokens are also allowed, as are implicit character tokens. However, \bgroup would be problematic: compare \demobox{A \bgroup AB\egroup} with \demobox{A AB} to see the issue.
You might want to extend \demobox in various ways, but that's not the object of the exercise.
About your suggestion to redefine \dolist to use \end instead of \endlist: you can do it, if you prefer. Knuth doesn't prefer it; instead he uses a control sequence that's specific to \dolist processing. Note that the definition of \endlist will produce an infinite loop whenever \endlist is expanded (probably because of a mistake in the macros using \dolist), whereas \end wouldn't.
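To illustrate the role of such a dedicated end marker, here is a minimal sketch of a \dolist-style loop that merely counts the tokens it walks over. It assumes plain TeX and a self-referential definition \def\endlist{\endlist}, which is what the infinite-loop remark above suggests; the names \countlist, \docount and \ntokens are invented for this example.

```tex
% Plain TeX sketch of a token loop terminated by a sentinel.
% \countlist, \docount and \ntokens are hypothetical names.
\newcount\ntokens
\def\countlist{\afterassignment\docount\let\next= }
\def\docount{\ifx\next\endlist \let\next\relax
  \else \advance\ntokens by 1 \let\next\countlist \fi
  \next}
\def\endlist{\endlist}% only ever compared with \ifx, never expanded

\ntokens=0 \countlist abc def\endlist
\message{[\the\ntokens]}% six letters and one space: [7]
\bye
```

The marker is only tested with \ifx and then discarded, so its looping definition is harmless in correct use. If a bug ever lets \endlist be expanded, TeX hangs at that spot, which makes the mistake hard to miss.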