What definition takes up more memory, \def or \chardef (considering \def contains a single character)?
The answer is quite easy: \chardef
uses much less resources.
Compiling
%\def\usedef{\expandafter\def\csname\the\count255\endcsname{a}}
\def\usedef{\expandafter\chardef\csname\the\count255\endcsname=`a }
\count255=1
\loop\ifnum\count255<500000
\usedef
\advance\count255 1
\repeat
\tracingstats=1
\bye
I get
Here is how much of TeX's memory you used:
499994 strings out of 2095124
2888925 string characters out of 6220659
5959 words of memory out of 5000000
500917 multiletter control sequences out of 15000+600000
14794 words of font info for 50 fonts, out of 8000000 for 9000
14 hyphenation exceptions out of 8191
5i,0n,1p,81b,6s stack positions out of 5000i,500n,10000p,200000b,80000s
Switching the comment in the first two lines, TeX will perform \def
instead of \chardef
and the log will show
Here is how much of TeX's memory you used:
499994 strings out of 2095124
2888921 string characters out of 6220659
1505944 words of memory out of 5000000
500917 multiletter control sequences out of 15000+600000
14794 words of font info for 50 fonts, out of 8000000 for 9000
14 hyphenation exceptions out of 8191
5i,0n,1p,82b,6s stack positions out of 5000i,500n,10000p,200000b,80000s
Note. I had to supply more memory space for strings with max_strings=10000000
in order to make space for 500000 new control sequences.
As \chardef
can only associate a control sequence with an integer value from 0 to 255 while \def
can associate it with arbitrary token lists, the latter is more general, and it shouldn't be surprising that it is also more expensive. In fact, The TeXbook introduces \chardef
as "for numbers between 0 and 255, as an efficient alternative to" \def
, and indeed it is.
As a further refinement of @egreg's answer, which uses either \def
or \chardef
500000 times and examines the stats, we can instead, for each (fixed) one of those two, compare the stats after using it (say) 1000 and 2000 times (or 1000 and 1001 times, or whatever), to see the additional (incremental) cost of each definition.
What you'll find is that for either \def\testName{a}
or \chardef\testName`\a
, the cost in common (for each such definition) is:
1 string entry (namely, the new string
testName
),8 string characters (namely, to store the string
t
e
s
t
N
a
m
e
) (depending on the length of your name),1 "multiletter control sequence" (assuming your name was of length greater than 1).
(Note that TeX has some optimizations to avoid storing things that already exist, so in some cases it may be able to avoid some of this cost, but not in general.) For \chardef\testName`\a
this is the only cost, but for each \def\testName{a}
there is the above cost and also a cost of:
- 3 words of memory
The reason for this is not hard to see. All that \chardef
has to do is put an integer value in a table. For example, your \chardef\testName`\a
is the same as \chardef\testName=97
. (For example, plain.tex
uses \chardef\active=13
.) So when you write \chardef\testName=97
, the bookkeeping that TeX does internally is something like the following:
- in the string pool: append eight characters
testName
. - in the list of strings (not explicitly stored): point from the latest string number (incremented by 1) to the starting position of the above 8 characters in the string pool
- in the list of control sequences: point from the hash of
testName
to the above string number - in the (internal) table of equivalents (
eqtb
), which (among other things) stores the meanings of control sequences: store a new entry witheq_level
the current/appropriate level,eq_type
the special codechar_given
(saying that the meaning of\testName
is that it's a chardef'd number),equiv
being the value itself (here, the integer97
).
You can see this in section §1224 of the program:
With \def
on the other hand, TeX internally has to create a token list containing the token a
(i.e. catcode 11=letter, and value 97). Further, every token list starts with a reference count and ends with NULL, which explains why there are three words of memory used.
So to summarize the difference, just look at the following two cases (with apologies that the diagram was not made with TikZ ): storing the integer 97 in an array is free (requires no extra space), while storing a pointer to a 3-word token list requires those three words of space.