Giving arbitrary Unicode characters, passed as arguments, a math-active definition?
EDIT: the bug has been fixed; I haven't checked when.
You have found a bug in XeTeX's implementation of \scantokens (the underlying primitive used for LaTeX3's \tl_rescan:nn) for characters beyond the BMP.
Running the following through (plain) LuaTeX yields (./test.tex ****120162,32**** ), the rightful character code of 𝕢 followed by that of a space (which follows #2 in the definition of \test).
Running it through (plain) XeTeX yields (./test.tex ****55349,56674**** ), which are the two surrogate code units (0xD835 and 0xDD62) of the UTF-16 representation of 𝕢. Basically, rescanning transforms 𝕢 into a pair of characters. Somehow, though, 𝕢 can safely go through being written to a file and input back: the problem really seems specific to \scantokens.
\def\test#1#2.{\message{****\number`#1,\number`#2 ****}}
\scantokens{\test 𝕢.}
\bye
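The XeTeX numbers can be checked against the UTF-16 encoding by hand. Here is a minimal sketch, assuming a LaTeX format with expl3 available; the constant name \c_example_code_int is hypothetical and not part of the test file above. It computes the high and low surrogates of code point 120162 and writes them to the terminal.

```latex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
% Code point of the test character (U+1D562); the constant name is hypothetical.
\int_const:Nn \c_example_code_int { 120162 }
% High surrogate: "D800 + floor((code - "10000) / "400)
\iow_term:x
  {
    high~surrogate:~
    \int_eval:n
      { 55296 + \int_div_truncate:nn { \c_example_code_int - 65536 } { 1024 } }
  }
% Low surrogate: "DC00 + ((code - "10000) mod "400)
\iow_term:x
  {
    low~surrogate:~
    \int_eval:n
      { 56320 + \int_mod:nn { \c_example_code_int - 65536 } { 1024 } }
  }
\ExplSyntaxOff
\begin{document}
\end{document}
```

The two values written are 55349 and 56674, the same pair that XeTeX reports.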
Please report.
There is already a function for globally assigning a meaning to an active character, \char_gset_active:Npn, so there is no need to resort to \tl_rescan:nn.
\documentclass{article}
\usepackage{xparse}
\usepackage{fontspec}
\setmainfont[Ligatures=TeX]{STIXGeneral}
\ExplSyntaxOn
\cs_new_protected:Npn \my_set_math_active:Nn #1 #2
  {
    % set the math code late, after the font setup is complete
    \AtBeginDocument{
      \char_set_mathcode:nn {`#1} { "8000 }
    }
    % globally define the active version of the character
    \group_begin:
      \char_gset_active:Npn #1 { #2 }
    \group_end:
  }
\my_set_math_active:Nn q {(testa)}
\my_set_math_active:Nn 𝕢 {(testb)}
\ExplSyntaxOff
\begin{document}
`q' is used in $q$.
`𝕢' is used in $𝕢$.
\end{document}
You can't directly use a math active character in its own definition, because an infinite loop will result. This has nothing to do with active characters; with the classic
{\catcode`?=\active \xdef?{(\string?)}}
\mathcode`?="8000
the input $?$ would explode even though the ? in the replacement text is not active (\string? yields a ? of category code 12), because it's math active.
There are workarounds. Here's one: if you want to use a character in its own replacement text when it is made math active, wrap it in \normal, which is defined in the following example:
\documentclass{article}
\usepackage{xparse}
\usepackage{unicode-math}
\setmainfont[Ligatures=TeX]{STIXGeneral}
\setmathfont{XITS Math}
\ExplSyntaxOn
\cs_new_protected:Npn \helvens_set_math_active:Nn #1 #2
  {
    % globally define the active version of the character
    \group_begin:
      \char_gset_active:Npn #1 { #2 }
    \group_end:
    % save the current math code before overwriting it
    \cs_set:cpx { helvens_old_#1 }
      { \Umathcharnum \int_eval:n { \Umathcodenum`#1 } ~ } % a space for terminating the number
    \char_set_mathcode:nn {`#1} { "8000 }
  }
\NewDocumentCommand{\setmathactive}{mm}
  {
    \helvens_set_math_active:Nn #1 { #2 }
  }
\NewDocumentCommand{\normal}{m}
  {
    \use:c { helvens_old_#1 }
  }
\ExplSyntaxOff
\setmathactive{q}{(\normal{q})}
\setmathactive{𝕢}{(\normal{𝕢})}
\begin{document}
`q' is used in $q$.
`𝕢' is used in $𝕢$.
And $\normal{𝕢}$ works in math.
\end{document}
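The same idea can be sketched for the classic ? example with 8-bit engines: save the original math code before making the character math active, and use the saved version in the replacement text. This is only a minimal plain TeX sketch, assuming the default plain TeX math code of ? ("503F); the name \normalquestion is hypothetical.

```latex
% plain TeX sketch; \normalquestion is a hypothetical name
\mathchardef\normalquestion=\mathcode`? % remember the original math code of ?
{\catcode`?=\active \gdef?{(\normalquestion)}} % no math active ? in the replacement text
\mathcode`?="8000 % from now on ? behaves like an active character in math mode
$?$ % expands to (\normalquestion), so there is no loop
\bye
```

Because the replacement text contains a \mathchardef token rather than the character itself, the expansion in math mode stops after one step.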