What transformations does a macro's text undergo before it is saved to memory?
Basically: what TeX stores is not text (a sequence of characters) but tokens. After your \def it remembers that the definition of \a has the control sequence \relax followed by macro parameter 1, and when you ask for this token list to be printed, it shows you a space after the \relax token, just for clarity.
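
As a concrete illustration, here is a minimal plain TeX sketch. The definition \def\a#1{\relax#1} is an assumption reconstructed from the description above, not a quotation from the question.

```tex
% Assumed definition: \relax immediately followed by parameter 1,
% with no space typed between them.
\def\a#1{\relax#1}
\show\a
% TeX reports:
% > \a=macro:
% #1->\relax #1.
\bye
```

The space after \relax in the report is generated when the token list is printed; it is not part of what is stored.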
I'm not sure about the “user”-level description of this in The TeXbook etc., but I can point you to the “internal” representation as described in TeX: The Program. See Part 20 (Token Lists) of texdoc tex. A token is either a (command code, character) pair, or a control sequence. With that background:
- The \def you type gets stored as a sequence of tokens:
- Then the above token list, when you ask for the definition of \mac to be shown, gets shown as:
(In this example the text happens to be identical, but note that you could have left out the space after the \b in the input text, and the internal token list representation would have been the same, so you would still have a space after the \b if you asked for it to be printed.)
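
One way to check this claim is to define two macros that differ only in that optional space and compare them with \ifx. This is a plain TeX sketch; the names \withspace and \nospace and the replacement text \b 1 are hypothetical, chosen only for illustration.

```tex
% Two definitions that differ only in the space typed after \b.
\def\withspace{\b 1}
\def\nospace{\b1}
% \ifx on two macros compares their stored token lists.
\ifx\withspace\nospace
  \message{same token list}
\else
  \message{different token lists}
\fi
% The message is "same token list": the space after the control word \b
% is skipped during tokenization, so it never reaches the stored list.
\bye
```

Both macros are shown by \show with a space after \b, whichever way they were typed.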
The code
\def\a#1{\def\b#1{Hello, world!}}%
\a c%
\bc%
defines \a with one parameter; upon calling \a c, #1 becomes c, so what's executed is
\def•\b•c•{•H•e•l•l•o•,• •w•o•r•l•d•!•}
which defines \b with parameter text c and replacement text Hello, world! (here • separates tokens, for clarity).
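
To confirm what \a c actually defined, one can ask TeX to show \b afterwards. A minimal plain TeX sketch:

```tex
\def\a#1{\def\b#1{Hello, world!}}
\a c
\show\b
% TeX reports:
% > \b=macro:
% c->Hello, world!.
% The "c" before "->" is the parameter text: \b must be followed by c.
\b c  % the space after \b is skipped, c matches the delimiter
\bye
```

Writing \bc instead would be read as the single control word \bc, which this code never defines.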
If you do \show\a, you get
> \a=macro:
#1->\def \b #1{Hello, world!}.
which uses a different representation (a space after a control word), but this amounts to exactly the same thing: \b has already been tokenized. If you want to define \bc, you need
\def\a#1{\expandafter\def\csname b#1\endcsname{Hello, world!}}
Now \a c will define \bc to have replacement text Hello, world!
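
The \csname version can be checked the same way. A minimal plain TeX sketch:

```tex
\def\a#1{\expandafter\def\csname b#1\endcsname{Hello, world!}}
\a c
\show\bc
% TeX reports:
% > \bc=macro:
% ->Hello, world!.
% Nothing appears before "->": \bc takes no parameters.
\bye
```

Here \csname b#1\endcsname builds the control sequence name from characters at expansion time, which is why the argument c can become part of the name.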
As said before, the representation with \show or \meaning adds a space after control words, which is inconsequential if you copy and paste the shown code, as spaces are ignored in that position. This way you can see the division into tokens. Of course, it's possible to be misled if something strange is done:
\catcode`z=12
\def\a#1{\def\z#1{Hello, world!}}
\show\a
would display
> \a=macro:
#1->\def \z#1{Hello, world!}.
but \a c would not define a macro \zc, because #1 is still a separate parameter token.
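
A minimal plain TeX sketch of what \a c does define in this situation:

```tex
\catcode`z=12
\def\a#1{\def\z#1{Hello, world!}}
\a c
\show\z
% TeX reports:
% > \z=macro:
% c->Hello, world!.
% So \a c defined the control symbol \z with parameter text c, not \zc.
\catcode`z=11
\bye
```

The \catcode`z=11 before \bye only restores the usual setting; it does not alter the stored definitions.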
No transformation whatsoever is done to the tokens during \def, except for tokenization (which fixes category codes, by the way).
The category codes of the characters forming the name of a macro are not recorded in any way. However, TeX will add a space in the \show/\meaning representation only after control words, not control symbols. So if you add
\catcode`z=11
\show\a
to the code above, the second \show will output
> \a=macro:
#1->\def \z #1{Hello, world!}.
because at this point, \z is to be considered a control word. This has more to do with input from a .tex file, rather than with tokens: always remember that already stored tokens are not touched by category code changes, but input ones are.
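
That character tokens keep the category code they had when they were read can be seen by comparing two macros whose bodies look identical in the source. A plain TeX sketch, with the hypothetical names \x and \y:

```tex
\def\x{z}       % z is stored with category code 11 (letter)
\catcode`z=12
\def\y{z}       % z is stored with category code 12 (other)
\catcode`z=11   % restoring the catcode does not touch \x or \y
\ifx\x\y
  \message{same}
\else
  \message{different}
\fi
% The message is "different": the two stored z tokens carry
% different category codes, fixed at tokenization time.
\bye
```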
A real world example is the LaTeX macro \@. In code where the category code of @ is 11 (which is frequently done for defining “private” macros), \@ still has the same definition as when @ has category code 12. However, care is needed when using \@ in code, because \@x is read in as two tokens in normal situations (catcode of @ is 12), but just one when the catcode of @ is 11. Not a big deal, but something to be aware of.
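
The difference can be observed with a plain TeX sketch. The definitions of \@ and \@x below are stand-ins made up for the demonstration, not the LaTeX definitions.

```tex
% Stand-in definitions, for illustration only.
\def\@{[at]}        % @ has catcode 12 here: \@ is a control symbol
\catcode`@=11
\def\@x{[at-x]}     % @ is a letter here: \@x is one control word
\message{\@x}       % one token; prints [at-x]
\catcode`@=12
\message{\@x}       % two tokens, \@ then x; prints [at]x
\bye
```

The same four characters \@x in the source give different token lists depending on the category code of @ at the moment they are read.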