Auto replacement of characters

With LuaLaTeX:

\documentclass{article}
\directlua
{
 fonts.handlers.otf.addfeature
  {
    name = "shuffle",
    type = "multiple",
    data =
      {
          ["a"] = {"p"},
          ["b"] = {"r"},
          ["c"] = {"s"},                
          ["d"] = {"t"},
      },
  }
}

\usepackage{fontspec}
  \setmainfont{texgyreheros}%
   [
    RawFeature=+shuffle,
   ]


\begin{document}
abcd
\end{document}
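
Since each character here maps to exactly one other character, the same idea can probably be written as a one-to-one feature. The following is only a sketch: it assumes that `fonts.handlers.otf.addfeature` accepts `type = "substitution"` with plain string values, as shown in the ConTeXt font documentation. I have not checked it against every luaotfload version. The font name and feature name are taken from the example above.

```latex
% Compile with LuaLaTeX.
% Sketch: one-to-one variant of the "shuffle" feature (assumes the
% "substitution" feature type is available in the installed luaotfload).
\documentclass{article}
\usepackage{fontspec}
\directlua
{
 fonts.handlers.otf.addfeature
  {
    name = "shuffle",
    type = "substitution",
    data =
      {
          ["a"] = "p",
          ["b"] = "r",
          ["c"] = "s",
          ["d"] = "t",
      },
  }
}
\setmainfont{texgyreheros}[RawFeature=+shuffle]

\begin{document}
abcd
\end{document}
```

Note that TeX-style `%` comments are kept outside the `\directlua` argument; Lua `--` comments inside it would swallow the rest of the chunk, because TeX joins the lines before Lua sees them.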


OK, it's not quite ready for release yet, but I will use this opportunity to introduce the upcoming tokcycle package (UPDATE: tokcycle package V1.0 has now been released on CTAN, https://www.ctan.org/pkg/tokcycle, released 2019/08/21). It helps you to build tools to process tokens from an input stream. The idea here is that if you can build a macro to process an arbitrary single (non-macro, non-space) token, then tokcycle can provide a wrapper for processing an input stream on a token-by-token basis, using your provided macro.

UPDATE: Christian Tellechea has offered valuable insights into how this package may be improved, and so I am spending more time to implement as many of these improvements as I am able. UPDATE: I have listed Christian as a contributor on the package. He provided bits of code and inspired me to try to make the parsing phase of tokcycle as general as possible.

The package's approach is to categorize what comes next in the input stream as either a Character, Group, Macro, or Space. Your job in creating the token-cycle is to specify the LaTeX directives on how to handle each of those four possibilities.

The package provides tools to help you build those directives, whose function it is to process the token stream and place the processed tokens in the output-stream, which is best constituted as a token register (provided by the package as \cytoks).

So let us take the problem at hand. I need to build a macro that can take a single Character token input and provide a mapping to a different character (in a different font). Here is the expandable code I propose for this:

\def\tcmapto#1#2{\expandafter\def\csname tcmapto#1\endcsname{#2}}
\def\tcremap#1{\ifcsname tcmapto#1\endcsname
                 \csname tcmapto#1\endcsname\else#1\fi}


\tcmapto अP
\tcmapto बQ
\tcmapto कR
\tcmapto डS

The mapping need not be to a single token. For example, \tcmapto ब{$\alpha$} is a valid mapping. The macro \tcremap basically says, if I find a remap, use it, otherwise just output the original token. I provide a remap of 4 tokens as shown above.
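
To see the two macros in isolation, here is a small sketch that uses ASCII letters instead of the Devanagari ones, so that it does not depend on the engine's Unicode handling. The particular mappings (`a` to `P`, `b` to `$\alpha$`) are chosen only for illustration.

```latex
\documentclass{article}
% Same two macros as above.
\def\tcmapto#1#2{\expandafter\def\csname tcmapto#1\endcsname{#2}}
\def\tcremap#1{\ifcsname tcmapto#1\endcsname
                 \csname tcmapto#1\endcsname\else#1\fi}
\tcmapto aP          % single-token replacement
\tcmapto b{$\alpha$} % multi-token replacement
\begin{document}
% a and b have mappings; z has none and is passed through unchanged.
\tcremap{a} \tcremap{b} \tcremap{z}
\end{document}
```

This should typeset P, then a math-mode alpha, then z. Because `\tcremap` consists only of `\ifcsname` and `\csname`, it works by expansion alone, which is what later allows it to be expanded inside a directive.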

So now let's get to the tokcycle syntax. It provides a Plain-TeX supported syntax (tokcycle.tex) of macros \tokcycle and \expandedtokcycle and pseudo-environment \tokencycle...\endtokencycle.

The expanded version applies \expanded to the input stream before tokcycle processing (macros can be cordoned off with \noexpand).

Also, it supports xpress versions of these macros/environments, so that repeated invocations can use the most recently specified directives, rather than having to retype the directives each time.

Finally, \tokcycleenvironment<\environmentname>{}{}{}{} provides a means to create a more permanent environment with its directives locked in place.

For this MWE (using LuaLaTeX), we will use

\tokencycle
  {<Character directive>}
  {<Group directive>}
  {<Macro directive>}
  {<Space directive>}%
<token input stream>
\endtokencycle

Now for the code. First, the code to address the OP's problem. It makes use, in the Character directive, of the new \expanded TeX primitive, in the form of the macro \addcytoks[x]{\tcremap{#1}}, which will fully expand the remap before appending the result to the \cytoks token register. If your engine still does not support \expanded, you can for this case replace that macro with \edef\tmp{\tcremap{#1}}\addcytoks[1]{\tmp}.

The other directives are just the package defaults: group content is recommitted to the token cycle on a per-token basis (\processtoks), while macros and spaces are echoed literally into the output stream (\addcytoks).

\documentclass{article}
\usepackage{tokcycle}
\def\tcmapto#1#2{\expandafter\def\csname tcmapto#1\endcsname{#2}}
\def\tcremap#1{\ifcsname tcmapto#1\endcsname
                 \csname tcmapto#1\endcsname\else#1\fi}
\tcmapto अP
\tcmapto बQ
\tcmapto कR
\tcmapto डS
\begin{document}
%अबकड
\verb|\tcremap| handles a single token: \tcremap{अ}.

\verb|\tokencycle|  and \verb|\tokcycle| handle a stream of 
  such tokens, including embedded macros.

\noindent\hrulefill

PSEUDO ENVIRONMENT

\tokencycle
{\addcytoks[x]{\tcremap{#1}}}
{\processtoks{#1}}
{\addcytoks{#1}}
{\addcytoks{#1}}
अबकड डड  \textit{बकअ} कड.

Other text for which no mapping is yet given as of \today.

अबक done.
\endtokencycle
\end{document}
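
For engines without `\expanded`, the fallback mentioned earlier can be dropped into the Character directive. The following sketch is a reduced variant of the MWE with ASCII mappings chosen only for illustration. The scratch macro name `\tmp` comes from the text above; everything else is unchanged.

```latex
\documentclass{article}
\usepackage{tokcycle}
\def\tcmapto#1#2{\expandafter\def\csname tcmapto#1\endcsname{#2}}
\def\tcremap#1{\ifcsname tcmapto#1\endcsname
                 \csname tcmapto#1\endcsname\else#1\fi}
\tcmapto aP
\tcmapto bQ
\begin{document}
\tokencycle
{\edef\tmp{\tcremap{#1}}\addcytoks[1]{\tmp}}% Character: expand the remap via \edef
{\processtoks{#1}}% Group: recommit the content to the token cycle
{\addcytoks{#1}}% Macro: echo unchanged
{\addcytoks{#1}}% Space: echo unchanged
abba \textit{cab}.
\endtokencycle
\end{document}
```

The `\edef` fully expands `\tcremap{#1}` into `\tmp`, and `\addcytoks[1]{\tmp}` then expands `\tmp` once before appending, so the token register receives the remapped character rather than the macro `\tmp` itself.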


Really, really bad idea :-)

You can make the characters you want to remap active and redefine them to print other characters. However, if you make, say, a active, then \newcommand (for example) won't work anymore, because TeX will read it as \newcomm, followed by the active a (which now prints p), followed by nd.

I defined a command \remap, which takes two characters and makes the first one print the second one, and another command, \remapchars, which should contain the \remap instructions; it typesets its argument with the new settings and reverts them afterwards to avoid problems. Do not use \remap “on the loose”. You've been warned. Here you go:

\documentclass{article}

\newcommand\remap[2]{%
  \catcode`#1=\active
  \begingroup
  \lccode`~=`#1%
  \lowercase{\endgroup\def~{\char`#2}}\ignorespaces}
\newcommand\remapchars{%
  \begingroup
  \remap ap
  \remap bq
  \remap cr
  \remap ds
  \innerremap}
\newcommand\innerremap[1]{#1\endgroup}

\begin{document}

`abcd'\remapchars{`abcd'}`abcd'

\end{document}
