How are parameter tokens (#1,#2,...,#9) processed?
TeX macros have two kinds of argument, delimited and undelimited. An undelimited argument is either a single token or, if that token is an explicit brace (a character of catcode 1), all the balanced text up to the matching } (a character of catcode 2); in the latter case the braces are not passed as part of the argument. So if you have
\def\xxx#1{...#1...}
then after \xxx Z, #1 will be the single token Z, but after \xxx {ab{c}} it will be the 5 tokens ab{c}.
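A minimal way to check this yourself, assuming plain TeX (for instance run with pdftex) and using \message only as a convenient way to print the argument on the terminal:

```latex
% Plain TeX sketch: print the undelimited argument between square brackets.
\def\xxx#1{\message{[#1]}}
\xxx Z        % #1 is the single token Z
\xxx {ab{c}}  % #1 is the five tokens ab{c}; the outer braces are stripped
\bye
```

The inner braces around c survive because only the outermost pair delimits the argument.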
Delimited arguments are similar but match all tokens up to a specified sequence of tokens (] in your example above). After
\def\yyy#1@?@{...#1...}
the input \yyy abc @?@ makes #1 the 4 tokens abc followed by a space, and the same tokens are passed if the input is \yyy {abc }@?@: if a delimited argument consists of just a single brace group, the outer level of braces is stripped.
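The same kind of test works for the delimited case. This is only a sketch for plain TeX; the third call is my own addition, showing that the braces are kept as soon as the argument is more than a single brace group:

```latex
% Plain TeX sketch: #1 is everything up to the delimiter @?@.
\def\yyy#1@?@{\message{[#1]}}
\yyy abc @?@    % #1 is abc followed by a space
\yyy {abc }@?@  % same four tokens: the argument is exactly one brace group
\yyy {abc} @?@  % braces kept: the argument is {abc} followed by a space
\bye
```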
\show only ever shows a single token, so \show #1 when #1 is one is the same as \show one, which will show o and then typeset ne.
The question
Is the recursive line
\expandafter\setarrayItem\fi
a kind of implicit loop for parameter token munching?
isn't particularly related to parameters, other than that the \fi closes the \ifx\end#1 test. This means that if #1 was not \end, the macro recursively calls itself in this branch; the branch taken when #1 is \end is empty, which stops the iteration.
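Stripped of the array bookkeeping, the recursion pattern looks roughly like this. It is a sketch for plain TeX, and \eachitem is a name made up for the illustration:

```latex
% Plain TeX sketch of the \expandafter...\fi recursion; \eachitem is hypothetical.
\def\eachitem#1{%
  \ifx\end#1\else
    \message{<#1>}%        do something with the current item
    \expandafter\eachitem  % \expandafter removes \fi before the next item is grabbed
  \fi}
\eachitem abc\end  % prints <a>, <b> and <c> in turn, then stops at \end
\bye
```

Without the \expandafter, \eachitem would take \fi itself as its argument instead of the next item.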
Parameter tokens #1 to #9 are only relevant at macro definition time, so thinking in terms of them is misleading you.
The macro \setarray has two undelimited arguments (because the parameter tokens are not separated from each other by anything). This means that TeX will look for two arguments when expanding \setarray.
When looking for an undelimited argument (again, the “delimited” or “undelimited” only refers to how the macro has been defined), TeX skips space tokens until finding a nonspace one. There are two cases:
- the nonspace token is not a <left brace>
- the nonspace token is a <left brace> (meaning an explicit { or any other character token with category code 1, but let's not complicate things).
In the first case, the nonspace token is substituted for the corresponding parameter in the replacement text. In the second case, TeX continues scanning the input looking for the matching <right brace> (so keeping track of brace nesting). When it has found it, it strips off those outer braces and substitutes the whole set of absorbed tokens in place of the corresponding parameter.
Thus, with \def\foo#1#2{-#1-#2-}, the calls
\foo\bar\x
\foo\bar{abc}
\foo{abc}\bar
\foo{abc}{def}
will result in delivering, respectively,
-\bar-\x-
-\bar-abc-
-abc-\bar-
-abc-def-
to the main token list for further processing.
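If you want to see these token lists without executing them, one possible sketch (plain TeX) stores the replacement text in a token register and prints it. Using \toks0 and \message is my choice for the demonstration, not part of the original \foo; note that TeX prints a space after each control word:

```latex
% Plain TeX sketch: same parameter text as \foo above, but the result is
% stored unexpanded in \toks0 and written to the terminal.
\def\foo#1#2{\toks0{-#1-#2-}\message{\the\toks0}}
\foo\bar\x
\foo\bar{abc}
\foo{abc}\bar
\foo{abc}{def}
\foo a b   % the space before b is skipped while looking for #2
\bye
```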
Let's see what \setarray\groups{{one}{two}{three}} does; by the rules above, #1 is replaced with \groups and #2 by {one}{two}{three}, so the new token list will be
\itemidx=0 \edef\tmp{\string\groups}\setarrayItem{one}{two}{three}\end
The two assignments are performed and we are left with
\setarrayItem{one}{two}{three}\end
According to its definition, \setarrayItem has one argument; the rules above say it's {one} (but the braces will be stripped off), so we get
\advance\itemidx by1
\ifx\end one\else
\expandafter\def\csname data:\tmp:\the\itemidx\endcsname{one}%
\expandafter\setarrayItem\fi
{two}{three}\end
(line breaks and % don't really make sense in token lists; I use them just for clarity). The assignment is performed and disappears (\itemidx will contain the value 1). Then the \ifx test is performed, comparing \end with o; since the two tokens are different, the tokens up to \else are swallowed, and we are left with
\expandafter\def\csname data:\tmp:\the\itemidx\endcsname{one}%
\expandafter\setarrayItem\fi
{two}{three}\end
OK, \expandafter acts on \csname, which will build a symbolic token; we'll be left with
\def\data:\groups:1{one}\expandafter\setarrayItem\fi{two}{three}\end
where, remember, \data:\groups:1 is a single token. The definition is performed and we're left with
\expandafter\setarrayItem\fi{two}{three}\end
Here \expandafter expands \fi (which leaves nothing), so we obtain
\setarrayItem{two}{three}\end
and the same steps are repeated, causing the definition of \data:\groups:2 and \data:\groups:3. At the next iteration, we'll be left with
\setarrayItem\end
and now we'll have
\advance\itemidx by1
\ifx\end\end\else
\expandafter\def\csname data:\tmp:\the\itemidx\endcsname{\end}%
\expandafter\setarrayItem\fi
The counter is advanced, then \end is compared to \end: oh, the test returns true! So nothing is removed except the test tokens, and we are left with
\else
\expandafter\def\csname data:\tmp:\the\itemidx\endcsname{\end}%
\expandafter\setarrayItem\fi
What's the expansion of \else? It consists of swallowing everything up to the matching \fi, making everything found disappear. End of the recursion. To recapitulate, we have defined three macros (with complicated names).
In the case of
\setarray\nogroups{one two three}
the routine will do the definitions
\def\data:\nogroups:1{o}
\def\data:\nogroups:2{n}
\def\data:\nogroups:3{e}
\def\data:\nogroups:4{t}
\def\data:\nogroups:5{w}
\def\data:\nogroups:6{o}
\def\data:\nogroups:7{t}
\def\data:\nogroups:8{h}
\def\data:\nogroups:9{r}
\def\data:\nogroups:10{e}
\def\data:\nogroups:11{e}
because the spaces between e and t and between o and t will be ignored, by the rule by which undelimited arguments are looked for.
The call \getarray[1]\groups will build the control sequence named \data:\groups:1 and its expansion will deliver one; with \getarray[1]\nogroups, the control sequence \data:\nogroups:1 is built and its expansion delivers o.
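To try the whole thing, here is a plain TeX test file. The definitions are not quoted in this answer, so I have reconstructed them from the expansions traced above; treat their exact form, and \getarray in particular, as an assumption rather than the original code.

```latex
% Plain TeX sketch; the three definitions are reconstructed (assumed), not quoted.
\newcount\itemidx
\def\setarray#1#2{\itemidx=0 \edef\tmp{\string#1}\setarrayItem#2\end}
\def\setarrayItem#1{\advance\itemidx by1
  \ifx\end#1\else
    \expandafter\def\csname data:\tmp:\the\itemidx\endcsname{#1}%
    \expandafter\setarrayItem\fi}
\def\getarray[#1]#2{\csname data:\string#2:#1\endcsname}

\setarray\groups{{one}{two}{three}}
\setarray\nogroups{one two three}
% Writes [one] [o] [e] to the terminal and the log.
\message{[\getarray[1]\groups] [\getarray[1]\nogroups] [\getarray[11]\nogroups]}
\bye
```

The end of line after by1 matters: it becomes the space that terminates the number, so TeX does not start expanding \ifx while still looking for digits.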
Now you should compare all of the above with this rough implementation in expl3, where the code is almost self-explaining (but less fun):
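\usepackage{expl3}
\ExplSyntaxOn
% store the whole list in the token list variable #1
\cs_new_protected:Npn \setarray #1 #2
  {
    \tl_clear_new:N #1
    \tl_set:Nn #1 { #2 }
  }
% expandably deliver item number #1 of the list #2
\cs_new:Npn \getarray [#1] #2
  {
    \tl_item:Nn #2 { #1 }
  }
\ExplSyntaxOff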
Not that I recommend the syntax \getarray[1]\groups, as the square brackets seem extraneous to the context. The expl3 version even allows, out of the box, calling \getarray[-1]\groups to access the last item in the array.
Oh, and
\setarray\foo{{\textbf{a}}{\textit{a}}{\textsf{a}}}
\edef\baz{\getarray[1]\foo}
would work and store \textbf{a} in \baz. Try it with the (admittedly clever) code by wipet.
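For completeness, a small LaTeX document exercising the expl3 version might look like the following sketch. The document body and the use of \typeout are my additions for testing; the index is braced in \tl_item:Nn so that a multi-token index such as -1 is read as one argument.

```latex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
\cs_new_protected:Npn \setarray #1 #2
  {
    \tl_clear_new:N #1
    \tl_set:Nn #1 { #2 }
  }
\cs_new:Npn \getarray [#1] #2
  {
    \tl_item:Nn #2 { #1 }
  }
\ExplSyntaxOff
\begin{document}
\setarray\groups{{one}{two}{three}}
\getarray[1]\groups, \getarray[-1]\groups % first and last item

\setarray\foo{{\textbf{a}}{\textit{a}}{\textsf{a}}}
\edef\baz{\getarray[1]\foo}
\typeout{\meaning\baz}% writes the stored replacement text to the log
\end{document}
```

The \edef is safe here because \tl_item:Nn delivers the item protected from further expansion.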