Automated sub- and superscripts

I believe that a syntax such as \mynewcommand{a}{bc}{de} would be clearer. Anyway, I can offer two implementations that differ in the treatment of spaces after the superscript and before the subscript. Take your pick.

\documentclass{article}

%\usepackage{xparse} % not needed with LaTeX 2020-10-01 or later

\ExplSyntaxOn
\NewDocumentCommand{\mynewcommandA}{m}
 {
  \textsuperscript{\tl_range:nnn { #1 } { 1 } { 1 } }
  \tl_range:nnn { #1 } { 2 } { -3 }
  \textsubscript{\tl_range:nnn { #1 } { -2 } { -1 } }
 }

\NewDocumentCommand{\mynewcommandB}{m}
 {
  \tl_set:Nn \l_tmpa_tl { #1 }
  \tl_replace_all:Nnn \l_tmpa_tl { ~ } { \c_space_tl }
  \textsuperscript{\tl_range:Nnn \l_tmpa_tl { 1 } { 1 } }
  \tl_range:Nnn \l_tmpa_tl { 2 } { -3 }
  \textsubscript{\tl_range:Nnn \l_tmpa_tl { -2 } { -1 } }
 }
\ExplSyntaxOff

\begin{document}

\textbf{Leading and trailing spaces are not kept}

\mynewcommandA{abcde}

\mynewcommandA{a some text that can contain \textit{other commands} cd}

\bigskip

\textbf{Leading and trailing spaces are kept}

\mynewcommandB{abcde}

\mynewcommandB{a some text that can contain \textit{other commands} cd}

\end{document}


Some more information. The function \tl_range:nnn takes three arguments: the first is a token list, and the second and third are integers that specify the range of items to extract. So {1}{1} extracts the first item (\tl_head:n would also do, but I used the more general function for uniformity), whereas {-2}{-1} extracts the last two items (negative indices count from the end), and {2}{-3} extracts the range from the second item to the third item from the end.
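To make the index conventions concrete, here is a minimal sketch that applies the same three ranges to the string abcde. The helper \showrange is a hypothetical name introduced only for this illustration; it is not part of the solution above.

```latex
\documentclass{article}

\ExplSyntaxOn
% \showrange is a hypothetical helper: #1 = token list, #2 = start, #3 = end
\NewDocumentCommand{\showrange}{mmm}
 {
  [ \tl_range:nnn { #1 } { #2 } { #3 } ]
 }
\ExplSyntaxOff

\begin{document}

\showrange{abcde}{1}{1}   % first item: a

\showrange{abcde}{2}{-3}  % second item through third from the end: bc

\showrange{abcde}{-2}{-1} % last two items: de

\end{document}
```

The square brackets are only there to make the boundaries of each extracted part visible in the output.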

However, in order to keep spaces at the boundaries of the extracted parts, we first have to replace spaces with \c_space_tl, which expands to a space but is not trimmed by the extraction functions. The syntax of \tl_range:Nnn is the same, except that the first argument has to be a tl variable.
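The effect of the replacement can be seen in isolation with a small sketch. The names \middleA and \middleB are hypothetical and only mirror the two approaches; both extract everything between the first and the last item.

```latex
\documentclass{article}

\ExplSyntaxOn
% Hypothetical helper: spaces at the ends of the extracted range are dropped
\NewDocumentCommand{\middleA}{m}
 {
  [ \tl_range:nnn { #1 } { 2 } { -2 } ]
 }

% Hypothetical helper: each space becomes a \c_space_tl item and survives
\NewDocumentCommand{\middleB}{m}
 {
  \tl_set:Nn \l_tmpa_tl { #1 }
  \tl_replace_all:Nnn \l_tmpa_tl { ~ } { \c_space_tl }
  [ \tl_range:Nnn \l_tmpa_tl { 2 } { -2 } ]
 }
\ExplSyntaxOff

\begin{document}

\middleA{a b c} % items are a, b, c; the range holds only b

\middleB{a b c} % items are a, space, b, space, c; the range keeps both spaces

\end{document}
```

One consequence worth keeping in mind: after the replacement every space counts as an item of its own, so if the input ends in something like "d e", the last two items are the space and e rather than d and e.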


For the sake of variety, here's a LuaLaTeX-based solution. It sets up a Lua function which, in turn, makes use of Lua's string functions string.sub and string.len to accomplish its task. It also sets up a LaTeX "wrapper" macro called \mynewcommand, which expands its argument once before passing it to the Lua function.

The solution actually employs variants of the Lua string functions, unicode.utf8.sub and unicode.utf8.len, to allow the argument of \mynewcommand to be any valid string of utf8-encoded characters. (Of course, in order to print the characters in the string, a suitable font has to be loaded.) The argument of \mynewcommand may contain primitives and macros.
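As a quick illustration of why the utf8-aware variants matter, the following sketch compares the byte count with the character count of a short non-ASCII string. It assumes the unicode library is available in the LuaTeX build in use, just as the solution itself does.

```latex
% !TEX TS-program = lualatex
\documentclass{article}
\begin{document}

% string.len counts bytes: each of these characters takes two bytes in utf8
Bytes: \directlua{tex.sprint(tostring(string.len("öçä")))}

% unicode.utf8.len counts characters
Characters: \directlua{tex.sprint(tostring(unicode.utf8.len("öçä")))}

\end{document}
```

With the plain string functions, the index arithmetic would therefore cut in the middle of a multi-byte character.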


% !TEX TS-program = lualatex
\documentclass{article}
\usepackage{luacode} % for "\luaexec" and "\luastringO" macros
\luaexec{
% Define a Lua function called "mycommand"
function mycommand ( s )
   local s1,s2,s3
   s1  = unicode.utf8.sub ( s, 1, 1 )
   s2  = unicode.utf8.sub ( s, 2, unicode.utf8.len(s)-2 )
   s3  = unicode.utf8.sub ( s, -2 )
   return ( "\\textsuperscript{" ..s1.. "}" ..s2.. "\\textsubscript{" ..s3.. "}" )
end
}
% Create a wrapper macro for the Lua function
\newcommand\mynewcommand[1]{\directlua{tex.sprint(mycommand(\luastringO{#1}))}}
    
\begin{document}
abcde $\to$ \mynewcommand{abcde}

öçäßüéà $\to$ \mynewcommand{öçäßüéà}

\mynewcommand{a some text that can contain \textit{\textbf{other commands}} cd}
\end{document}

For the sake of complexity, here is how to solve this problem at the TeX primitive level:

\newcount\bufflen

\def\splitbuff #1#2{% #1: number of tokens from end, #2 data
                    % result: \buff, \restbuff
    \edef\buff{\detokenize{#2} }%
    \edef\buff{\expandafter}\expandafter\protectspaces \buff \\
    \bufflen=0 \expandafter\setbufflen\buff\end
    \advance\bufflen by-#1\relax
    \ifnum\bufflen<0 \errmessage{#1>buffer length}\fi
    \ifnum\bufflen>0 \edef\buff{\expandafter}\expandafter\splitbuffA \buff\end
    \else \let\restbuff=\buff \def\buff{}\fi
    \edef\tmp{\gdef\noexpand\buff{\buff}\gdef\noexpand\restbuff{\restbuff}}%
    {\endlinechar=-1 \scantokens\expandafter{\tmp}}%
}

\def\protectspaces #1 #2 {\addto\buff{#1}%
    \ifx\\#2\else \addto\buff{{ }}\afterfi \protectspaces #2 \fi}  
\def\afterfi #1\fi{\fi#1}
\long\def\addto#1#2{\expandafter\def\expandafter#1\expandafter{#1#2}}

\def\setbufflen #1{%
    \ifx\end#1\else \advance\bufflen by1 \expandafter\setbufflen\fi}

\def\splitbuffA #1{\addto\buff{#1}\advance\bufflen by-1
    \ifnum\bufflen>0 \expandafter\splitbuffA
    \else \expandafter\splitbuffB \fi
}
\def\splitbuffB #1\end{\def\restbuff{#1}}

% --------------- \mynewcommand implementation:

\def\textup#1{$^{\rm #1}$}  \def\textdown#1{$_{\rm #1}$}
\def\mynewcommand#1{\mynewcommandA#1\end}
\def\mynewcommandA#1#2\end{%
   \textup{#1}\splitbuff 2{#2}\buff \textdown{\restbuff}}

% --------------- test:

\mynewcommand{abcde}

\mynewcommand{a some text that can contain {\it other commands} cd}

\bye
