What does the comment-pln code actually do?
The definition of \newcodes@
saves the current category codes of \ { } # %
into macros and changes them to 12.
Making the backslash an “other character” is necessary, not to block the interpretation of macros (which would be discarded before any expansion is attempted), but to keep TeX from seeing conditionals and tracking their nesting in skipped branches. Also, appearances of \comment@@@
in the commented block would ruin everything.
Braces are made catcode 12 so that TeX will not try to keep track of their nesting. Without this, a {
in a line and }
in another line might be disastrous. The percent is made other so that it doesn't mask the end-of-line.
I see no apparent reason for changing the catcode of #
; however this avoids its “doubling” when absorbed in the argument of a macro.
The other special characters are not a problem, because all found tokens will be discarded. They'd need treatment if we wanted to do verbatim printing.
The macro \oldcodes@
restores the category codes that have changed.
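The actual definitions are not reproduced here, so the following is only a minimal sketch of how such a save/restore pair could look. The storage macros \bs@code, \lb@code, \rb@code, \hs@code and \pc@code, as well as \demo, are names invented for this illustration and are not taken from comment-pln.

```tex
\catcode`\@=11 % allow @ in macro names

% Hypothetical reconstruction: save the five catcodes in macros, then set them to 12.
\def\newcodes@{%
  \edef\bs@code{\the\catcode`\\}%
  \edef\lb@code{\the\catcode`\{}%
  \edef\rb@code{\the\catcode`\}}%
  \edef\hs@code{\the\catcode`\#}%
  \edef\pc@code{\the\catcode`\%}%
  \catcode`\\=12 \catcode`\{=12 \catcode`\}=12
  \catcode`\#=12 \catcode`\%=12\relax
}

% Restore the saved values.
\def\oldcodes@{%
  \catcode`\\=\bs@code\relax
  \catcode`\{=\lb@code\relax
  \catcode`\}=\rb@code\relax
  \catcode`\#=\hs@code\relax
  \catcode`\%=\pc@code\relax
}

% Both calls sit inside one macro, so they are tokenized before the catcodes change.
\def\demo{%
  \newcodes@ \message{\the\catcode`\\}% reports 12
  \oldcodes@ \message{\the\catcode`\\}% reports 0
}
\demo
\bye
```

The test has to live inside a macro: once the backslash has catcode 12, \oldcodes@ can no longer be typed directly in the file, only reached through tokens that already exist.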
Now the most important macros.
First \begincomment
executes \newcodes@
and sets the end-of-line mechanism to deliver character number 10, in double-hat notation ^^J
. Then \comment@
is executed.
What's its job? It looks forward for ^^J
(that is, up to the end of the line), taking as its argument whatever is found in between; then it delivers \comment@@
followed by the argument followed by the characters \endcomment
, the first of category code 12 and the other ten of category code 11, ending with \comment@@@
.
The \lowercase
is used so every !
in its argument is changed into a backslash before doing anything else; since no uppercase character appears, that's the only change. Remember that \lowercase
only affects characters and leaves control sequences alone.
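A small stand-alone check of that last remark, with \A chosen purely for illustration: the characters in the argument are lowercased, while the control sequence name is left alone.

```tex
% \lowercase turns the character tokens A, B, C into a, b, c,
% but \A is a control sequence and stays \A.
\lowercase{\def\A{ABC}}
\message{\meaning\A}% reports macro:->abc
\bye
```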
Thus the macro \comment@@
is defined to have an argument delimited by \endcomment
(not a macro). Its main purpose is to discard the argument, but it also does \futurelet\next\comment@@@
.
Let's see an example that will help discussing the other macros:
\begincomment
foo
\endcomment
After doing the assignments of category codes and to \endlinechar
, \comment@
is found. The end-of-line hasn't yet be tokenized, but it will be now, when \comment@
looks for its argument. In this case it's empty, so
\comment@@!endcomment\comment@@@
is placed in the input stream (I denote by !endcomment
the eleven characters described above). The (empty) argument is discarded and the input stream will have
\futurelet\next\comment@@@\comment@@@
Then \next
is set equal to \comment@@@
and \comment@@@
is expanded. Its definition is to collect as an argument whatever comes along up to the first \comment@@@
token. In this case, nothing. It then examines the meaning of \next
.
In this case \next
is the same as \comment@@@
, so \next
is set equal to \comment@
and the false branch is skipped over. Then \next
, that is, \comment@
, is executed, which will see
foo^^J
and do essentially the same (try following it). After this,
\comment@!endcomment^^J
remains on the input stream. This is transformed into
\comment@@!endcomment!endcomment\comment@@@
and the argument to \comment@@
is empty (it wouldn't be if some character preceded \endcomment
on the same line). The first !endcomment
is removed as part of processing the delimited argument and
\futurelet\next\comment@@@!endcomment\comment@@@
remains. Now \next
is set to !
(actually a catcode 12 backslash, of course) and the argument to \comment@@@
is absorbed and discarded. Since \next
is not \comment@@@
, the endgame \oldcodes@\endlinechar`\^^M
is executed.
Important points.
When a macro is defined, the tokens in its parameter text and replacement text are already in internal format and don't depend in any way on subsequent changes of category codes.
When a macro with delimited arguments is expanded, the delimiters are removed from the input stream along with the arguments and the removed tokens will be replaced by the macro's replacement text.
By contrast, \futurelet
never removes tokens.
When \lowercase
changes a character, it doesn't change the category code. So with
{\lccode`!=`\\ \lowercase{!}}
one gets a category code 12 backslash. Note that the \lccode
of !
will revert to its previous value (usually 0) upon the end of the group. An equivalent way to do the same is
\begingroup\lccode`!=`\\ \lowercase{\endgroup!}
because \lowercase
does not interpret any token it finds; so \endgroup
will be executed after !
has been changed to a backslash.
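These points can be checked in isolation. The following is a minimal sketch in plain TeX; the macro names \upto, \report and \bschar are made up for the test and have nothing to do with comment-pln.

```tex
% Delimited argument: the delimiter "." is removed together with the argument.
\def\upto#1.{\message{[#1]}}
\upto abc.% reports [abc]; neither "abc" nor "." is typeset

% \futurelet only peeks: the "A" stays in the input and is typeset afterwards.
\def\report{\message{[\meaning\next]}}
\futurelet\next\report A% reports [the letter A]

% \lowercase keeps the category code: the catcode 12 "!" becomes
% a catcode 12 backslash, stored here in \bschar.
\begingroup\lccode`!=`\\ \lowercase{\endgroup\def\bschar{!}}
\message{[\meaning\bschar]}% reports [macro:->\]
\bye
```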
I'd probably write
\begingroup\lccode`\!=`\\ \lowercase{\endgroup
\def\comment@#1^^J{\comment@@#1!endcomment\comment@@@}%
\def\comment@@#1!endcomment{\futurelet\next\comment@@@}%
}% end of \lowercase
\def\comment@@@#1\comment@@@{%
\ifx\next\comment@@@
\let\next=\comment@
\else
\def\next{\oldcodes@\endlinechar=`\^^M\relax}%
\fi
\next
}
Fewer braces to keep track of.
Before trying to understand that definition, let us give it a proper indentation and line numbers so it (arguably) is more readable:
01:{%
02: \lccode`\!=`\\
03: \lowercase{%
04: \gdef\comment@#1^^J{\comment@@#1!endcomment\comment@@@}%
05: \gdef\comment@@#1!endcomment{\futurelet\next\comment@@@}%
06: \gdef\comment@@@#1\comment@@@{%
07: \ifx\next\comment@@@
08: \let\next=\comment@
09: \else
10: \def\next{\oldcodes@\endlinechar=`\^^M\relax}%
11: \fi
12: \next}%
13: }%
14:}
It starts by opening a group (line 1) to make the following \lccode
setting local. Then it changes the \lccode
of !
to the character code of \
, and then issues \lowercase
on the remaining block of code. This is the so-called \lowercase
trick for injecting a character with an unusual catcode in a definition. The contraption:
{\lccode`\!=`\\
\lowercase{<code>}%
}
could be replaced by
\begingroup
\lccode`\!=`\\
\lowercase{%
\endgroup
<code>}
in which case each \gdef
could be replaced by \def
. But back to the definition. After \lowercase
has done its thing, all character tokens that have a nonzero \lccode
have been replaced by their lowercase counterparts, and this includes the replacement of the character !
by the character \
(that is, a catcode 12 backslash). I'll show them as \\
in the block of code below to distinguish them from the catcode 0 backslashes, but each is actually a single \
token of catcode 12:
04: \gdef\comment@#1^^J{\comment@@#1\\endcomment\comment@@@}%
05: \gdef\comment@@#1\\endcomment{\futurelet\next\comment@@@}%
06: \gdef\comment@@@#1\comment@@@{%
07: \ifx\next\comment@@@
08: \let\next=\comment@
09: \else
10: \def\next{\oldcodes@\endlinechar=`\^^M\relax}%
11: \fi
12: \next}%
Now here's what each of these three commands does: \comment@
grabs everything until the next ^^J
(which, since the \endlinechar
was set to 10, is the line end) and does \comment@@#1\\endcomment\comment@@@
(remember that \\
is a single backslash).
There are two possibilities here: the first is if #1
is a "normal" line of code to be commented, and the second is if #1
is the string \\endcomment
.
Now let's see each possibility separately: if the line doesn't contain \\endcomment
, then \comment@@
will grab everything up to the \\endcomment
left by \comment@
, discard it, and do \futurelet\next\comment@@@
. Since everything up to the \\endcomment
was discarded, the next token is another \comment@@@
. Now the \futurelet
does its thing and (the first) \comment@@@
is expanded, and checks if the \next
token is also \comment@@@
. If so, it does \let\next=\comment@
and another line is processed.
When a line that contains \\endcomment
is found, \comment@@
will use that as delimiter, and the \futurelet\next\comment@@@
thing will assign something else to \next
, other than \comment@@@
. In this case, the \ifx
test in \comment@@@
will return false and \next
will be redefined to return the catcodes to normal with \oldcodes@
, and the processing is done.
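The same three-macro loop can be tried without any catcode changes. The following toy version is only an illustration under simplifying assumptions: items are separated by commas instead of line ends, the end marker is the plain string STOP instead of a catcode 12 \endcomment, and the names \drop@, \drop@@ and \drop@@@ are invented here.

```tex
\catcode`\@=11 % allow @ in macro names

% Grab one item (up to the comma) and append the end marker and a sentinel.
\def\drop@#1,{\drop@@#1STOP\drop@@@}
% Discard everything up to the first STOP, then peek at the next token.
\def\drop@@#1STOP{\futurelet\next\drop@@@}
% Discard up to the sentinel; loop if the peeked token was the sentinel itself.
\def\drop@@@#1\drop@@@{%
  \ifx\next\drop@@@
    \let\next=\drop@ % ordinary item: process the next one
  \else
    \def\next{\message{[done]}}% the item contained STOP: leave the loop
  \fi
  \next}

\drop@ foo,bar,STOP,after
\bye
```

Here foo and bar are swallowed silently, [done] is reported when the item STOP is reached, and only "after" is typeset.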