What is "expansion"?

Expansion is when: You take a thing that is expandable and ... well ... you expand it.

For example, if you define \newcommand\Aa{\Bb\relax}, then:

  1. \Aa is (once) expandable, and its expansion is \Bb\relax.

  2. \Aa is not fully expandable, because its 1st expansion contains \relax, which is unexpandable.
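To see both points on paper, here is a small sketch. The definition of \Bb and the helper names \onestep and \allsteps are my own assumptions for illustration; the text above leaves \Bb undefined.

```latex
\documentclass{article}
\newcommand\Bb{text}        % assumed definition, only for this illustration
\newcommand\Aa{\Bb\relax}
\begin{document}
% The \expandafter chain expands \Aa exactly once before \def stores the result.
\expandafter\def\expandafter\onestep\expandafter{\Aa}
\texttt{\meaning\onestep}   % shows: macro:->\Bb \relax

% \edef keeps expanding as far as it can; \relax is unexpandable, so it stays.
\edef\allsteps{\Aa}
\texttt{\meaning\allsteps}  % shows: macro:->text\relax
\end{document}
```

The leftover \relax in \allsteps is exactly why \Aa does not count as fully expandable.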

In short, expansion is the process of substituting a macro's contents in place of the macro call. What TeX-core (and hence LaTeX) does is that it:

  • Expands whatever expandable it sees;
  • Processes whatever un-expandable it sees.

So, for instance, the TeX definition primitive \def is unexpandable. TeX therefore processes it, in the following way (ignoring any special cases): take the first thing after \def, which is the name of the macro being defined. Take whatever follows up to the {, which is the parameter text. Then take the following block {...}, which is what the new macro expands to. In general TeX doesn't expand the things it reads while processing a command unless that command requires it, and \def does not require it.
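As a small illustration of those three pieces (name, parameter text, replacement text), here is a sketch with a made-up macro \pair; the name and its delimiters are assumptions, not something from the question.

```latex
\documentclass{article}
\begin{document}
% name: \pair    parameter text: (#1,#2)    replacement text: #2 before #1
\def\pair(#1,#2){#2 before #1}

\pair(x,y)   % typesets: y before x
\end{document}
```

Everything between the name and the first { is stored as the parameter text, which is why the parentheses and the comma act as delimiters when \pair is used.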

On the other hand, when TeX-core sees in LaTeX \newcommand\greeting[1]{Hello #1!}, then it starts expanding \newcommand (which is expandable to some extent), and after many crazy things (and with me not being quite precise), it expands the sequence \newcommand\greeting[1]{Hello #1!} into \def\greeting#1{Hello #1!}. We already know from before that \def is unexpandable, and when this is processed, it will define a macro called \greeting which takes one parameter and which expands to Hello #1! wherever used.

So now that we have our \greeting macro, let's use it! So we write \greeting{glassbjs}. TeX-core sees it, and knowing that \greeting is expandable and takes one argument, it expands the sequence \greeting{glassbjs} into Hello glassbjs!. Since printable characters (letters, spaces, punctuation, ...) are by default unexpandable, TeX-core tries to process them, which means that it converts them into the glyphs from the current font (again I'm not precise) and it adds them to something that is called "the current horizontal list". And at that moment, TeX-core is happy about it and it moves forward.
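If you want to check this yourself, the following minimal document should do. It also shows the imprecision admitted above: \newcommand actually creates a \long macro.

```latex
\documentclass{article}
\newcommand\greeting[1]{Hello #1!}
\begin{document}
\greeting{glassbjs}          % typesets: Hello glassbjs!

\texttt{\meaning\greeting}   % shows: \long macro:#1->Hello #1!
\end{document}
```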

If you look for a similarity in other languages, expansion is closest to pre-processor substitutions like C's #define f(x) something-here. So the above thing would be pretty similar to (in C++):

#define greeting(x) cout << "Hello " << x << "!"

However, expansion has some strange rules that do not apply to pre-processing in C. The two main differences are:

  1. Expansion is not pre-processing; it is interleaved with processing. This means that:

  2. You can re-define things, and the definition used is the one that is valid in the current scope; in this sense, all macros are much more similar to variables than to functions; they are sensitive to changes, to grouping etc.
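A sketch of the second point, with the hypothetical macros \who and \hello: the meaning of \who is looked up when \hello is used, not when \hello is defined, and a redefinition inside a group disappears when the group ends.

```latex
\documentclass{article}
\begin{document}
\def\who{world}
\def\hello{Hello \who!}

\hello                  % typesets: Hello world!

{\def\who{TeX}\hello}   % typesets: Hello TeX!

\hello                  % typesets: Hello world! (the group has ended)
\end{document}
```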


Expansion is simply half of how TeX works, so this is not going to be an easy answer.

First of all, TeX operates on tokens (see What is a token?). A token can be either expandable or unexpandable; if it's unexpandable, it obviously never expands, otherwise it can expand. What an expandable token expands to depends on how it has been made part of the language.

Some primitive tokens (that is, those whose meaning is predefined when TeX starts from scratch) are expandable, some aren't. Among the expandable primitives one finds \expandafter, \if and all other conditionals; this was predictable, as these primitives are used to control the expansion process itself; the expansion of an expandable primitive can be empty, but it can trigger some other action. All macros defined with \def (along with its variants \gdef, \edef or \xdef) are expandable and their expansion is precisely the replacement text.

So, assume we have

\def\test{some text}

or, in LaTeX, \newcommand\test{some text} which is pretty similar. Under normal operations, if TeX finds

Here is \test

it will absorb Here<space>is<space> (eight unexpandable tokens) and stop at \test, examine it and decide that it is expandable (TeX uses the current meaning of \test), so it replaces it by some text and goes on starting from “s”. Since this token is unexpandable, it's handed over to another phase of processing and TeX goes on with “o” and so on.

Now, suppose that we have

\def\test{\foo{o}me text}
\def\foo#1{s#1}

an input such as

Here is \test

would make TeX hand over tokens up to finding \test, which it would replace with

\foo{o}me text

and \foo, after having absorbed its argument {o}, would now be expanded to so, leaving some text in the input.
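The two steps can be made visible with \expandafter chains. This is only a sketch; \stepone and \steptwo are names I made up to hold the intermediate results.

```latex
\documentclass{article}
\begin{document}
\def\test{\foo{o}me text}
\def\foo#1{s#1}

% Single chain: expand \test once before \def stores the result.
\expandafter\def\expandafter\stepone\expandafter{\test}
\texttt{\meaning\stepone}   % shows: macro:->\foo {o}me text

% Triple chain: expand \test, then expand the resulting \foo{o} as well.
\expandafter\expandafter\expandafter\def
\expandafter\expandafter\expandafter\steptwo
\expandafter\expandafter\expandafter{\test}
\texttt{\meaning\steptwo}   % shows: macro:->some text
\end{document}
```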

I said “normal operations”, because in some cases expansion is inhibited:

  1. when TeX is looking for macro arguments;

  2. when TeX is doing a definition with \def or \gdef, with regard to the macro to be defined, the parameter text and the replacement text;

  3. when TeX is doing a definition with \edef or \xdef, but this only concerns the macro to be defined and the parameter text, not the replacement text;

  4. when TeX is examining the token to be defined after \let, \chardef, \mathchardef and all other \...def commands;

  5. when TeX is examining the “right hand side” of a \let;

  6. when TeX is absorbing the tokens for \write, \uppercase, \lowercase, \detokenize, \unexpanded or an assignment to a token register.
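A few of these cases can be tried directly. In this sketch \saved and \demotoks are hypothetical names; a dedicated register is allocated because LaTeX uses \toks0 as scratch space.

```latex
\documentclass{article}
\newtoks\demotoks            % hypothetical register, for illustration only
\begin{document}
\def\test{some text}
\let\saved=\test             % case 5: \test is not expanded, \saved gets its current meaning
\demotoks={\test}            % case 6: the register stores the single token \test
\def\test{other text}

\saved                       % typesets: some text

\the\demotoks                % typesets: other text (\test is expanded only now)

\texttt{\detokenize{\test}}  % prints the control sequence name itself, unexpanded
\end{document}
```

\saved keeps the old meaning because \let copied it, whereas the register only stored the token \test, so the later redefinition is what gets used.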

Sometimes we talk about "full expansion". What's that? When we do

\edef\baz{<tokens>}

the <tokens> are subject to expansion, the resulting token list is subject to expansion and so on, until only unexpandable tokens remain, always starting from the next token as explained before. After this, the definition is performed, using the token list thus obtained as replacement text. Thus, with the definition above,

\edef\baz{\test}

would expand \test to \foo{o}me text and this to some text, resulting in the same as if we said \def\baz{some text}. Why would we want to do it this way? Because TeX uses the current meaning of the tokens; if we said

\def\baz{\test}

and later changed the definition of \foo (or of \test, of course), the macro \baz would expand to something else; with \edef we're sure it expands to some text.
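Here is the difference side by side, as a sketch with the hypothetical names \lazy (defined with \def) and \eager (defined with \edef):

```latex
\documentclass{article}
\begin{document}
\def\test{\foo{o}me text}
\def\foo#1{s#1}

\def\lazy{\test}     % stores the single token \test
\edef\eager{\test}   % stores the characters of: some text

\def\foo#1{S#1}      % a later change to \foo

\lazy    % typesets: Some text

\eager   % typesets: some text
\end{document}
```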

However, \edef doesn't perform assignments. So if we do

\count255=1
\edef\baz{\advance\count255 by 1 \number\count255 }

using \baz would print 1, not 2. This is because \advance and \count are not expandable; character tokens aren't expandable either; \number is expandable and its expansion is the decimal representation of the <number> following it. So this \edef is just equivalent to

\def\baz{\advance\count255 by 1 1}
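This can be checked with \meaning. In the sketch below I use a freshly allocated register \democount (a hypothetical name) instead of \count255, because LaTeX itself uses \count255 as the scratch register \count@.

```latex
\documentclass{article}
\newcount\democount      % hypothetical register standing in for \count255
\begin{document}
\democount=1
\edef\baz{\advance\democount by 1 \number\democount }
\texttt{\meaning\baz}    % shows: macro:->\advance \democount by 1 1

\baz                     % advances the register, then typesets 1

\the\democount           % typesets 2: the assignment happened when \baz was used
\end{document}
```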

The same happens when, ultimately, TeX expands the token list in a \write: the expansion is performed "all the way", just like in \edef. Expansion of tokens can be inhibited by preceding the one we don't want to expand with \noexpand or enclosing them as the argument to \unexpanded. Note that this is just a “one shot” activity: if one does

\def\a{A}\def\b{B}\def\c{C}
\edef\foo{\a\b\noexpand\c}

the replacement text for \foo will be AB\c; upon usage of \foo, \c will be expanded according to its current meaning.
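A runnable variant of this, with two changes that are my own: the macros are renamed to \mya, \myb, \myc because LaTeX already defines \a, \b and \c, and a second definition \viaunexp is added to show \unexpanded.

```latex
\documentclass{article}
\begin{document}
\def\mya{A}\def\myb{B}\def\myc{C}

\edef\foo{\mya\myb\noexpand\myc}
\texttt{\meaning\foo}        % shows: macro:->AB\myc

\edef\viaunexp{\unexpanded{\mya\myb}\myc}
\texttt{\meaning\viaunexp}   % shows: macro:->\mya \myb C

\def\myc{Z}
\foo                         % typesets: ABZ (current meaning of \myc)
\end{document}
```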

Also, tokens appearing after \csname and before the matching \endcsname are subject to full expansion; in this case only character tokens must remain, so a \relax is illegal here, while it wouldn't be in \edef or \write.
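A short sketch of the \csname case, with the hypothetical helper \ending supplying part of the name:

```latex
\documentclass{article}
\begin{document}
\def\ending{ting}
% \ending is fully expanded while the name is being built, so this defines \greeting.
\expandafter\def\csname gree\ending\endcsname{Hello!}

\greeting                       % typesets: Hello!

\csname gree\ending\endcsname   % typesets: Hello!

% \csname gree\relax\endcsname would stop with "Missing \endcsname inserted".
\end{document}
```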

Complicated? Yes.


Expansion by example, from The TeXbook (p 200, answer on p 328):

EXERCISE 20.2: What is the expansion of \puzzle, given the following definitions?

\def\a{\b}
\def\b{A\def\a{B\def\a{C\def\a{\b}}}}
\def\puzzle{\a\a\a\a\a}

Answer: ABCAB. (The first \a expands into A\def\a{B...}; this redefines \a, so the second \a expands into B..., etc.) At least, that's what happens if \puzzle is encountered when TeX is building a list. But if \puzzle is expanded in an \edef or \message or something like that, we will see later that the interior \def commands are not performed while the expansion is taking place, and the control sequences following \def are expanded; so the result is an infinite string

A\def A\def A\def A\def A\def A\def A\def A\def A...

which causes TeX to abort because the program's input stack is finite. This example points out that a control sequence (e.g., \b) need not be defined when it appears in the replacement text of a definition. The example also shows that TeX doesn't expand a macro until it needs to.

We can see the expansion when using \tracingmacros=1 (see LaTeX \tracing commands list?) and examining the .log of the following MWE:

\documentclass{article}
\def\a{\b}
\def\b{A\def\a{B\def\a{C\def\a{\b}}}}
\def\puzzle{\a\a\a\a\a}
\begin{document}
{\tracingmacros=1

\puzzle}
\end{document}

With some comments:

\puzzle ->\a \a \a \a \a % Expansion of \puzzle contains five \a's

\a ->\b % Expansion of first \a

\b ->A\def \a {B\def \a {C\def \a {\b }}} % Expansion of \b (adds A to output)

\a ->B\def \a {C\def \a {\b }} % Expansion of second \a (adds B to output)

\a ->C\def \a {\b } % Expansion of third \a (adds C to output)

\a ->\b % Expansion of fourth \a

\b ->A\def \a {B\def \a {C\def \a {\b }}} % Expansion of \b (adds A to output)

\a ->B\def \a {C\def \a {\b }} % Expansion of fifth \a (adds B to output)

The final redefinition of \a (to C\def\a{\b}) is never "used".