(Why) Can't we get 'fully expandable' versions of every command?

The problem is that this assumption is basically false:

"The point being, TeX does eventually arrive at the final string"

It is not like C and its macro preprocessor cpp, where all the expansion happens first, resulting in an expanded version of the C file which can be passed to the C compiler (or intercepted for debugging or other reasons). In TeX, non-expandable operations and expansion are inextricably interwoven.

The classic example is

    \setbox0\hbox{hello}\the\wd0

\the\wd0 is expandable and expands to the decimal representation of the width of box 0. However, you cannot expand it until you have done the non-expandable operation of typesetting text into the box.

So given

    \def\foo#1{\setbox0\hbox{#1}\the\wd0}

you cannot have an expandable version of the command \foo.
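
One way to see this is to force expansion with \edef. The following is a minimal plain TeX sketch rather than a definitive test; \result is a name chosen here for illustration, and box 0 is emptied first so that the outcome is predictable:

    \def\foo#1{\setbox0\hbox{#1}\the\wd0}
    \setbox0\hbox{}% box 0 now has width 0pt
    \edef\result{\foo{hello}}% expansion only: \setbox is NOT executed
    \message{\meaning\result}% shows the unexecuted \setbox tokens, then 0.0pt
    \bye

The width that ends up in \result is that of the old, empty box, because the assignment never ran during the \edef.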


The distinction between expandable and non-expandable is often arbitrary. So the answer is that in TeX it cannot be done. It's like saying my name is David. There are alternative histories where different things may have been true, but we have the system we have.

Consider arithmetic: in classic TeX, to print one more than some value you have to write

    \advance\mycount by 1 \the\mycount

and that is a non-expandable operation. In e-TeX you can write

    \the\numexpr\mycount+1\relax

which is an expandable operation. There is no principled reason why one version of arithmetic is expandable and the other is not; it just is. If someone is extending TeX (e-TeX, pdfTeX, XeTeX, the TeX part of LuaTeX), then any new primitive that is added has to be classified as expandable or non-expandable, as the TeX programming language requires that distinction. Commands are expandable or non-expandable because the person who designed them classified them that way.
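
The difference shows up as soon as both forms are placed inside an \edef. This is a small sketch assuming an engine with the e-TeX extensions enabled (for example pdftex with the plain format); \expandable and \frozen are names made up for the illustration:

    \newcount\mycount
    \mycount=41
    % \numexpr is evaluated during expansion, so the sum is computed here
    \edef\expandable{\the\numexpr\mycount+1\relax}
    % \advance is not executed by \edef; only \the\mycount is expanded
    \edef\frozen{\advance\mycount by 1 \the\mycount}
    \message{\meaning\expandable}% macro:->42
    \message{\meaning\frozen}% macro:->\advance \mycount by 1 41
    \bye

In the second case the assignment is merely stored as tokens, and the old value 41 is frozen into the macro.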

As noted in the comments, a possible workaround of writing out fragments to a sub-process running a different instance of TeX cannot help, even if \write were expandable. The result of evaluating any argument is highly context-dependent (it may involve references, definitions within the document, or the time the job started), so it would be impossible in general to generate the same text in the sub-process as in the original process.


It's probably worth noting that the term in quotes in the question, "fully expandable", isn't well defined (or definable), which is part of the reason for the confusion around the topic of expansion. As discussed in an earlier question, "Advantages and disadvantages of fully expandable macros", a better term is "safe in an expansion-only context". Commands that are safe in expansion-only contexts such as \edef or \write include character tokens and the \relax primitive mentioned in the comments.
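
As a small illustration of "safe" in this sense (a plain TeX sketch; \safe is a hypothetical name), ordinary character tokens and \relax pass through \edef unchanged:

    \edef\safe{abc\relax}% nothing here expands or needs to be executed
    \message{\meaning\safe}% macro:->abc\relax
    \bye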


Understanding "expandable" as TeX does is tricky, because there are really two kinds of result for a TeX command sequence (something such as \foo, which is commonly called a "macro", but which I won't because that has the baggage of macro expansion, which I'm trying to demystify rather than further confuse). I'll be using made-up terms for these.

  • The first kind of result is input, and this is what happens after "expanding" \foo. This just replaces \foo with other tokens that can be further scanned by TeX.

  • The second kind of result is output, and this is what happens when the result is no longer scanned by TeX. This has two sub-types:
    (a) Internal output. This is what the programming constructs like \def and \let do, as well as other assignments like \count0=1 that aren't exactly implemented by a control sequence.
    (b) External output. This is what is typeset. However, even if nothing ever made it to the page, it is still placed in a box in memory, and the contents of the box are rendered with fonts and explicit spacing.

TeX's operation is a long crawl through a stream of input tokens, each of which is converted to one of three things: external output, if it is directly typesettable material such as letters, other font symbols, or boxes; internal output, if it is an assignment of some kind and thus "saves" its result for later use; or more input, if it simply expands right back into the input stream. These processes can be mixed, of course, such as when you assign to a box register (as internal output) a box that was produced as external output.

The process of expansion is blind: it does not interact with the mechanisms that produce external output. These are the measurement and placement mechanisms, so the example of \setbox0=\hbox{hello}\the\wd0, in passing through the external output mechanism to do the width computation, is necessarily non-expandable, because width does not exist at the input level. (As it happens, it also has to pass through the internal output mechanism to save the box, but that is dictated only by the design of TeX that forbids you from writing \the\wd\hbox{hello}.)

Now, you ask for the "final result" of something that is typeset to be available as input. I think what you mean is the following:

You want to circumvent the restriction on expanding internal output, i.e. you want \def\foo{hello}\foo to be converted to hello. This sounds like a reasonable request that, as David Carlisle said, is simply impossible. TeX designates \def to be non-expandable and that's it. One wonders (I once did) why \def does not "expand" to nothing as well as making the definition, and there's no reason I know of, but the fact is that as things stand, it passes through the internal output mechanism and therefore does not directly manipulate the input stream.

If I may philosophize some more, what you are requesting is, I think, a common way of "doing it wrong". You want to issue a programming construct that results in further input that is also valid as a programming construct for producing external output. Since the two languages, though intertwined, are actually different, this is impossible. That doesn't mean there aren't equivalent ways of writing your input construct so that it creates what you want. A common workaround is to do a lot of non-expandable stuff that repeatedly builds a macro \result, and then its contents are available expandably afterwards (this is what the previous examples do, actually).
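
The \result workaround might be sketched as follows in plain TeX. \measure is a hypothetical helper invented for this illustration, and it must itself be called outside any expansion-only context:

    % Non-expandable step: typeset the box, then store its width as plain tokens
    \def\measure#1{\setbox0\hbox{#1}\edef\result{\the\wd0}}
    \measure{hello}
    % \result now expands to a dimension, so it is safe inside \message, \edef or \write
    \message{width of hello: \result}
    \bye

The design choice is to separate the two phases: everything that needs the typesetting machinery runs first, and only its stored result is used where expansion alone is available.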


Maybe David's answer can be complemented by a non-TeX perspective.

As TeX macro processing is comparable to a programming language, the notion of expandable versus non-expandable does exist in computer science, but with different wording.

Essentially, a fully expandable macro or function is a pure function, i.e. a function without side effects that does not depend on anything but its arguments. In pseudo-C:

    int twice(int x)
    {
        return 2*x;
    }

For a given input, the output does not change if we replace the function call by its return value: twice(4) can always be replaced by 8. Now we take a slightly modified function and end up with something that is inherently non-expandable:

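    int current_page;  /* global state that changes as the document is processed */

    int twice_current_page(void)
    {
        return 2*current_page;  /* result depends on when the function is called */
    }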

twice_current_page() cannot be replaced/expanded in the document. The only way to get the result is to execute everything up to this point and then execute the function body.
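
Translated back into TeX, the pure case could look like the sketch below. It assumes an engine with e-TeX's \numexpr; \twice and \eight are hypothetical names:

    % Depends only on its argument, so it can be replaced by its value anywhere
    \def\twice#1{\the\numexpr 2*(#1)\relax}
    \edef\eight{\twice{4}}
    \message{\meaning\eight}% macro:->8
    \bye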