What exactly do \csname and \endcsname do?
Normally, control sequence names are made only of letters or of one non-letter character.
A letter is, more precisely, a character having category code 11 at the moment the control sequence name is read. So, any character can become part of a control sequence name, provided we change its catcode before the definition and each usage.
With \csname...\endcsname
we are freed from this limitation and every character can go inside them to form a control sequence name (of course, %
is excluded, because it disappears together with the rest of the line before TeX turns the characters into tokens).
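A minimal sketch of this freedom, which should work in both plain TeX and LaTeX. The name "my-macro 2" is invented for the example and contains a hyphen, a space and a digit; the role of \expandafter is discussed further on:
% define a macro whose name is "my-macro 2"
\expandafter\def\csname my-macro 2\endcsname{Hello}
% the name cannot be typed after a backslash, so it is built again for each use
\csname my-macro 2\endcsname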
However, this is not the main purpose of \csname...\endcsname
. This construction is used to build commands from "variable parts". Think, for instance, of LaTeX's \newcounter
: after \newcounter{foo}
, TeX knows \thefoo
that is built precisely in this way. Roughly, what LaTeX does is
\newcommand{\newcounter}[1]{%
  \expandafter\newcount\csname c@#1\endcsname
  \expandafter\def\csname the#1\endcsname{\arabic{#1}}%
}
so that \newcounter{foo}
does the right job. It's more complicated than this, of course, but the main things are here; \newcount
is the low-level command to allocate a counter. The \expandafter
is just to build the control sequence before \newcount
and \def
see the token.
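To see why the \expandafter matters, compare the two variants in this sketch (the name bar7 is only an example):
% Without \expandafter, \def would redefine \csname itself,
% taking "bar7\endcsname" as its parameter text. Don't do this:
% \def\csname bar7\endcsname{text}
% With \expandafter, \csname...\endcsname becomes a single token before \def acts:
\expandafter\def\csname bar7\endcsname{text}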
Inside \csname...\endcsname
, category codes don't matter (with one main exception: active characters will be expanded if not preceded by \string
, see final note). LaTeX exploits this in order to build control sequence names that users won't be able to access (easily). For example, the control sequence to choose the default ten point font is \OT1/cmr/m/n/10
, which can be easily split internally (by the "reverse" operation that is \string
) and is not available to the casual user.
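A small sketch of that "reverse" operation, assuming LaTeX's \typeout for printing to the terminal: \string turns the control sequence back into its characters, preceded by the escape character.
\typeout{\expandafter\string\csname OT1/cmr/m/n/10\endcsname}
% writes \OT1/cmr/m/n/10 to the terminal and the log file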
Another important use is in environments: when you say \newenvironment{foo}
, LaTeX really defines \foo
and \endfoo
. Upon finding \begin{foo}
, LaTeX does some bookkeeping and then executes \csname foo\endcsname
(that's why one can also say \newenvironment{foo*}
); similarly, at \end{foo}
LaTeX executes \csname endfoo\endcsname
and after this it does some bookkeeping again.
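A rough sketch of what this means in practice. The environment name note* is invented here, and the bookkeeping (such as checking that \begin and \end match) is left out:
\newenvironment{note*}{\begin{quote}\itshape}{\end{quote}}
% \begin{note*}Some text.\end{note*} then roughly amounts to:
\begingroup
\csname note*\endcsname
Some text.
\csname endnote*\endcsname
\endgroup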
Other uses: \label{foo}
will define control sequences based on foo
via \csname...\endcsname
that can be used by \ref
.
When one says \csname foo\endcsname
, TeX checks whether \foo
is defined; if not, it will execute \relax
and from then on (respecting grouping), \foo
will be interpreted as \relax
. An interesting use of this feature is that one can say
\chapter*{Introduction}
\csname phantomsection\endcsname
\addcontentsline{toc}{chapter}{Introduction}
and keep hyperref
happy if it's loaded, while doing nothing if the package is not loaded.
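The same behavior underlies the traditional test for whether a command is defined. A minimal sketch, where the name foo and the messages are placeholders; note that as a side effect an undefined \foo becomes equal to \relax:
\expandafter\ifx\csname foo\endcsname\relax
  \typeout{foo is undefined (or equal to \string\relax)}%
\else
  \typeout{foo is defined}%
\fi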
Many other interesting uses of this trick are possible. But one should always keep in mind that TeX fully expands what it finds between \csname and \endcsname, and that only character tokens may remain after the expansion. So
\csname abc\relax def\endcsname
is forbidden, because \relax is unexpandable and is not a character. But, after \def\xyz{abc}
,
\csname \xyz def\endcsname
will be legal and equivalent to saying \csname abcdef\endcsname
or \abcdef
.
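Because of the full expansion, the variable part can also come from a number. In this sketch the names item1 and item2 and the helper \getitem are invented for illustration:
\expandafter\def\csname item1\endcsname{first}
\expandafter\def\csname item2\endcsname{second}
% \number turns the argument into digits; \endcsname stops the scan
\newcommand{\getitem}[1]{\csname item\number#1\endcsname}
\getitem{2}% typesets: second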
Final note
A few words about category codes are in order. An active character in \csname...\endcsname
will be expanded, so to get a literal ~
one has to write \string~
. Comment (category 14), ignored (category 9) and invalid (category 15) characters keep their usual behavior. So
\csname %\endcsname
will give an error (Missing \endcsname
); in \csname ^^@\endcsname
there will be no character and \csname ^^?\endcsname
will raise an error.
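A short sketch of the active character case, assuming the usual LaTeX setup in which ~ is active; the name a~b is just an example. Without \string, the expansion of ~ leaves non-character tokens behind and typically ends in a "Missing \endcsname" error:
% ~ is active, so it has to be neutralized with \string
\expandafter\def\csname a\string~b\endcsname{tilde name}
\csname a\string~b\endcsname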
For reference, from the TeXbook (with slight formatting changes), Chapter 7: How TeX Reads What You Type (p. 40):
...you can go from a list of character tokens to a control sequence by saying
\csname<tokens>\endcsname
. The tokens that appear in this construction between \csname
and \endcsname
may include other control sequences, as long as those control sequences ultimately expand into characters instead of TeX primitives; the final characters can be of any category, not necessarily letters. For example, \csname TeX\endcsname
is essentially the same as \TeX
; but \csname\TeX\endcsname
is illegal, because \TeX
expands into tokens containing the \kern
primitive. Furthermore, \csname\string\TeX\endcsname
will produce the unusual control sequence \\TeX
, i.e., the token <\TeX>
, which you can't ordinarily write.
I have used this indirectly by using the \label
-\ref
system and defining labels based on counters:
\newcounter{mycount}
%...
\newcommand{\mycmd}{%
\stepcounter{mycount}%
\label{abc\themycount}%
%...
}
This creates successive labels abc1
, abc2
, ... for every call to \mycmd
, so that no label is defined more than once. Indirectly, \label{abc\themycount}
leads (through the \newlabel entry it writes to the .aux file) to the equivalent of \@namedef{r@abc\themycount}
, that is,
\expandafter\def\csname r@abc\themycount\endcsname
with r@abc\themycount
expanded to r@abc1
, which defines \r@abc1
for the first label, \r@abc2
for the second label, etc. Yes, labels in LaTeX are actually control sequences whose names are prefixed with r@
and built using \csname ... \endcsname
, which is what allows numerals in them.
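For reference, the LaTeX kernel defines \@namedef and its companion \@nameuse essentially as shown in the comments below (quoted from memory of latex.ltx, so treat the exact form as an assumption); the name abc1 is just an example:
\makeatletter
% \def\@namedef#1{\expandafter\def\csname #1\endcsname}
% \def\@nameuse#1{\csname #1\endcsname}
\@namedef{abc1}{one}% same as \expandafter\def\csname abc1\endcsname{one}
\@nameuse{abc1}% typesets: one
\makeatother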
Suppose you want to define a command \foo2
. You cannot do this directly, because 2 is not a letter: \def\foo2 would define \foo with 2 as a delimiter. However, this construction works: \csname foo2\endcsname
. Sometimes this is useful, e.g. when you need a series of commands, \foo1
, \foo2
, etc. (another way is to use roman numerals). As another example, suppose you want to define a series of commands like \endsection
, \endsubsection
, etc. Then you can use a loop with \expandafter\def\csname end#1\endcsname...
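One way such a loop could look, sketched with the LaTeX kernel's \@for. The helper \defineend, the loop variable \x and the replacement text are made up for illustration:
\makeatletter
\newcommand{\defineend}[1]{%
  \expandafter\def\csname end#1\endcsname{\par\noindent[end of #1]\par}%
}
\@for\x:=section,subsection,subsubsection\do{%
  % expand \x once so that \defineend sees the name itself
  \expandafter\defineend\expandafter{\x}%
}
\makeatother
After this, \endsection, \endsubsection and \endsubsubsection are defined. Passing the name through a helper means the replacement text stores the name itself rather than the loop variable.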