Expandable full expansion of tokens that preserves catcodes

Did you try using \romannumeral? This is used a lot for this type of thing (see for example the \exp_args:Nf concept in expl3):

\def\fullyexpand#1{\romannumeral - `0#1}

This works because TeX keeps expanding #1 while it looks for the end of the number. The number itself (-`0, that is, -48) is negative, so \romannumeral produces no output. Note that this solution stops at the first non-expandable token, unlike an \edef, which keeps going.
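
As a small illustration of where the expansion stops, here is a sketch using two throwaway macros (\inner, \outer and \result are names made up for this example, not part of the question):

\def\inner{abc}
\def\outer{\inner\inner}
% Trigger \romannumeral once, before \def stores the replacement text.
\expandafter\def\expandafter\result\expandafter{\romannumeral -`0\outer}
% \result now holds: abc\inner
% The first \inner was expanded; expansion stopped at the letter a,
% so the second \inner is left untouched.

An \edef of the same material would instead give abcabc.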

It's possible to build a function which can expand using \romannumeral 'around' unexpandable tokens. For example, the following code will work reasonably well:

\long\def\fullyexpand#1{%
  \csname donothing\fullyexpandauxi{#1}{}%
}
\long\def\fullyexpandauxi#1{%
  \expandafter\fullyexpandauxii\romannumeral -`0#1\fullyexpandend
}
\long\def\fullyexpandauxii#1#2\fullyexpandend#3{%
  \ifx\donothing#2\donothing
    \expandafter\fullyexpandend
  \else
    \expandafter\fullyexpandloop
  \fi
  {#1}{#2}{#3}%
}
\long\def\fullyexpandend#1#2#3{\endcsname#3#1}
\long\def\fullyexpandloop#1#2#3{%
  \fullyexpandauxi{#2}{#3#1}%
}
\def\donothing{}

However, this is not the same as \expanded, for a few reasons. First, my implementation strips out spaces in the argument (it loops over the input, and TeX skips spaces as it does so). Braces also get stripped out. A bit of testing also reveals that \romannumeral will expand \protected functions here, whereas \expanded does not. I'd also note that the above code needs guards added for a blank (empty or all-space) argument, as things currently fail in these cases.
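
To make the space and brace stripping concrete, here is a short usage sketch. It is not self-contained: it assumes the loop-based \fullyexpand and its helpers from the listing above are already defined, and \result is just a scratch name for this example.

% Assumes the loop-based \fullyexpand definitions above are loaded.
% The outer \romannumeral forces the whole loop to run before \def.
\expandafter\def\expandafter\result\expandafter{%
  \romannumeral -`0\fullyexpand{a b{c}}}
% \result now holds: abc
% Both the space after a and the braces around c have been lost.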


With a current release of LuaTeX one can use \expanded, which does more or less the same as an \edef but is expandable (it also doesn't require doubled # tokens). This primitive will be in TeX Live 2019 pdfTeX/e-pTeX/e-upTeX, and hopefully in XeTeX (yet to be confirmed). As a precursor to this, expl3 has a macro-based emulation, slow but working, which does token-by-token examination and allows 'e-type' expansion.
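
For comparison with the \romannumeral approach, a minimal sketch of \expanded in use might look like this. It only works on an engine that provides the primitive, and \inner, \outer and \result are names invented for the example:

\def\inner{abc}
\def\outer{\inner\inner}
% One expansion step of \expanded yields the fully expanded contents.
\expandafter\def\expandafter\result\expandafter{\expanded{\outer\relax\outer}}
% \result now holds: abcabc\relax abcabc
% Expansion continues past the unexpandable \relax, as it would in an \edef.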


As an aside, it is possible to use \scantokens expandably, but as you may have found this can be tricky, and it is usually necessary to make a (non-expandable) change to \everyeof first. LuaTeX addresses this issue with the \scantextokens primitive, which builds the end-of-file handling directly into the primitive. Of course, if you are using LuaTeX then the original problem is solvable anyway, since \expanded is available.
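
One common way of making that \everyeof change is sketched below. It assumes an e-TeX-enabled engine; \result is a scratch name, and the group is only there to keep the settings local:

\begingroup
  % Have TeX meet \noexpand just before the end of the pseudo-file,
  % so the end of file does not break the definition being scanned.
  \everyeof{\noexpand}%
  % Avoid a stray space from the end of the scanned "line".
  \endlinechar=-1
  \xdef\result{\scantokens{abc}}%
\endgroup

The assignments to \everyeof and \endlinechar are what make the overall operation non-expandable, even though \scantokens itself can be used inside the \xdef.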


After learning many tricks on this forum, I propose the following solution. I think it does what @TH. wanted, namely it expands everything, and when it meets a non-expandable token, it just stores it and continues. It uses a future package of mine, whose code can currently be found online, or as my answer to this question. This code is input in the first line below. Set whichever filename you want.

Note that I have not been careful to reset catcodes to their default value in ULcase.sty, so here, we don't need \catcode`:=11\catcode`_=11\relax. One day, I'll clean this up.

% Code based on the extended Upper- and Lower-casing code found
% in the ULcase package.
\input ULcase.sty\relax

% ============ Table |fullyexpand|, |\fullyexpand|
% The |fullyexpand| table expands every token that is expandable
% according to the test \FE_token_if_expandable:NTF.
% 
% Then |\fullyexpand| is basically changing the case using a special
% "case table", |ULfullyexpand|. Here, |\MEA_trigger:f| is |\romannumeral|
% in disguise, forcing the full expansion of |\UL_to_case_aux:nn|
% (This is necessary for technical reasons.)
%
\def\fullyexpand{\MEA_trigger:f\UL_to_case_aux:nn{ULfullyexpand}}


% A few tests, building up to \FE_token_if_expandable:NTF.
%
\long\gdef\FE_token_if_expandable:NTF#1{%
  \FE_token_if_defined:NTF#1%
  {%
    \FE_token_if_eq_noexpand_self:NTF#1%
    {\use_ii:nn}%
    {\FE_token_if_protected:NTF#1{\use_ii:nn}{\use_i:nn}}%
  }%
  {\use_ii:nn}%
}
\long\gdef\FE_token_if_defined:NTF#1{%
  \ifdefined #1%
    \expandafter\use_i:nn%
  \else%
    \expandafter\use_ii:nn%
  \fi%
}%
\long\gdef\FE_token_if_eq_noexpand_self:NTF#1{%
  \expandafter\ifx\noexpand#1#1%
    \expandafter\use_i:nn%
  \else%
    \expandafter\use_ii:nn%
  \fi%
}
% |\expandsome| only expands the tokens following |\expandthis|.
\expandsome{%
  \long\gdef\FE_token_if_protected:NTF#1{%
    \expandafter\FE_token_if_protected_aux:w\meaning#1%
    \expandthis\string\protected\q_stop}%
}
% |\expandafter:nw{...}\foo| expands |\foo| before |...|.
\expandafter:nw{\long\gdef\FE_token_if_protected_aux:w#1}%
\string\protected#2\q_stop{%
  \UL_if_empty:nTF{#1}%
}%

% ===== Building the table.
% We copy the standard definitions (in particular for braces)
% NB: maybe problem with \NoCaseChange.
\UL_new_table:n{ULfullyexpand}
% 
% Spaces are just kept:
\UL_setup:nnn{ULfullyexpand}{ }{ }
% 
% The default action is to check if #1 (the next token) is expandable.
% If it is, we expand it. Otherwise, we output it. "#2{#3}" is responsible
% for continuing the loop.
\long\gdef\UL_table_ULfullyexpand_default#1#2#3{%
  \FE_token_if_expandable:NTF#1{% 
    \expandafter:nw{#2{#3}}#1%
  }{%
    \UL_to_case_output:n{#1}#2{#3}%
  }%
}%
%
% Define |\noexpand|
\long\expandafter\gdef\csname UL_table_ULfullyexpand_%
  \detokenize{\noexpand}\endcsname#1#2#3{%
    \UL_to_case_output:n{#3}#1{#2}}
%
% Define |\detokenize|
\long\expandafter\gdef\csname UL_table_ULfullyexpand_%
  \detokenize{\detokenize}\endcsname#1#2{%
    \expandafter:nw{\FE_detok_unexp_aux:nnNn{#1}{#2}\detokenize}%
    \romannumeral-`\0}
%
% Define |\unexpanded|
\long\expandafter\gdef\csname UL_table_ULfullyexpand_%
  \detokenize{\unexpanded}\endcsname#1#2{%
    \expandafter:nw{\FE_detok_unexp_aux:nnNn{#1}{#2}\unexpanded}%
    \romannumeral-`\0}
% 
% A helper for |\detokenize| and |\unexpanded|.
\long\gdef\FE_detok_unexp_aux:nnNn#1#2#3#4{%
  \expandafter\UL_to_case_output:n\expandafter{#3{#4}}%
  #1{#2}}%


% ===== Tests
\long\gdef\fooA#1{\fooC{\noexpand\fooA got #1} \fooB{a nice #1}}
\long\gdef\fooB#1{fooB\space got <#1>}
\protected\def\fooC#1{Not expanded.}
\long\gdef\a{\fullyexpand{Text: \fooA{argument}}}
\expandonce\a\expandonce\a
\long\gdef\b{Text: \fooC {\fooA got argument} fooB got <a nice argument>}
\checkoutput

\long\gdef\a{\fullyexpand{Text:%
    \detokenize\expandafter{\fooA{argument}\fooA{hi}}}}
\long\xdef\b{Text:\detokenize{\fooC {\noexpand \fooA got argument} %
    \fooB {a nice argument}\fooA {hi}}}
\expandonce\a\expandonce\a
\checkoutput

\long\gdef\a{\fullyexpand{Text:%
    \unexpanded\expandafter{\fooA{argument}\fooA{hi}}}}
\long\xdef\b{Text:\unexpanded{\fooC {\noexpand \fooA got argument} %
    \fooB {a nice argument}\fooA {hi}}}
\expandonce\a\expandonce\a
\checkoutput


\long\gdef\foo{foo}
\long\gdef\a{\fullyexpand{a\foo \ifnum1=1\number2 yes\else no\fi}}
\expandonce\a\expandonce\a
\long\gdef\b{afoono}

\endinput
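
The \noexpand comparison at the heart of the expandability test in the listing above can be tried on its own. The following is a simplified sketch, not the package code: \firstoftwo, \secondoftwo and \unexpandableTF are hypothetical names, and the extra checks for undefined and \protected tokens are left out.

\long\def\firstoftwo#1#2{#1}
\long\def\secondoftwo#1#2{#2}
% \noexpand makes an expandable token look like \relax to \ifx,
% so the comparison is true only for tokens that are not expandable.
\long\def\unexpandableTF#1{%
  \expandafter\ifx\noexpand#1#1%
    \expandafter\firstoftwo
  \else
    \expandafter\secondoftwo
  \fi
}
\def\foo{abc}
\message{\unexpandableTF\foo{unexpandable}{expandable}}   % expandable
\message{\unexpandableTF\relax{unexpandable}{expandable}} % unexpandable
\message{\unexpandableTF a{unexpandable}{expandable}}     % unexpandable

An undefined control sequence would need a separate check first, which is why the full code also tests with \ifdefined.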

Exhaustive (\edef-like) expansion is not possible with pdfTeX 1.40 in an expandable way. All suggested methods differ in their behavior from \edef:

  • Using \csname fails whenever the exhaustive expansion contains non-character tokens (e.g. primitives or protected macros).
  • Using \romannumeral expands only until the first unexpandable token, as explained by Joseph.