Command to count characters in a specified string
This assumes the argument is plain text; if it contains macros, the solution would need to change. Spaces count as characters, though that could be adjusted if desired.
\documentclass{article}
\usepackage{stringstrings}
\newcommand{\numchars}[1]{\noindent The string ``#1'' has \stringlength{#1} characters.\\}
\begin{document}
\numchars{everything}
\numchars{that's not it!}
\numchars{weird}
\end{document}
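On the macro caveat mentioned at the top: one possible way to handle arguments that contain macros is to let the string package expand them before counting. The sketch below uses \StrLen from the xstring package (which a later answer here also uses). By default xstring works in \fullexpandarg mode and fully expands its arguments; \noexpandarg switches that off. The macro \greeting is hypothetical and exists only for illustration.

```latex
\documentclass{article}
\usepackage{xstring}
\newcommand{\greeting}{hello}% hypothetical macro, for illustration only
\begin{document}
% Default mode (\fullexpandarg): \greeting is expanded before counting.
\noindent Expanded: \StrLen{\greeting, world} characters.\\
% \noexpandarg: the argument is taken as typed, so \greeting stays unexpanded.
\noexpandarg
Unexpanded: \StrLen{\greeting, world} units.
\end{document}
```

In the first line the count should reflect the expanded text (hello, world); in the second, \greeting is left unexpanded, so the result will differ. Fragile or unexpandable macros in the argument may still cause errors in the default mode.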
Here's a version that does not count spaces.
\documentclass{article}
\usepackage{stringstrings}
\newcommand{\numchars}[1]{%
\convertchar[q]{#1}{ }{}%
\noindent The string ``#1'' has \stringlength{\thestring} characters.\\
}
\begin{document}
\numchars{everything}
\numchars{that's not it!}
\numchars{weird}
\end{document}
And if you wanted to count only alphabetic characters (ignoring numbers, spaces, and punctuation):
\documentclass{article}
\usepackage{stringstrings}
\newcommand{\numchars}[1]{%
\convertchar[q]{#1}{ }{}%
\alphabetic[q]{\thestring}%
\noindent The string ``#1'' has \stringlength{\thestring} characters.\\
}
\begin{document}
\numchars{everything}
\numchars{that's not it!}
\numchars{weird}
\end{document}
Even though the OP has stated that he/she isn't interested in a LuaLaTeX-based solution, others may still value having such a solution. :-)
The following solution works with strings of UTF-8-encoded characters. Because ASCII is a subset of UTF-8, the solution also works with ASCII-encoded strings.
% !TEX TS-program = lualatex
\documentclass{article}
\usepackage{fontspec}
\usepackage{luacode} % for "\luastring" macro
\newcommand{\numchars}[1]{\noindent The string ``#1'' has
\directlua{tex.sprint(unicode.utf8.len(\luastring{#1}))}
characters.\par}
\begin{document}
\numchars{everything}
\numchars{öüß}
\end{document}
Aside: If the Lua-side code inappropriately used the function string.len instead of unicode.utf8.len, the macro \numchars would report that öüß has 6 characters. This happens because each of the 3 characters in öüß is encoded using 2 bytes in UTF-8. (The function string.len does a byte count rather than a direct character count; that's OK if each character is encoded using exactly 1 byte, which is the case for the ASCII encoding system, though not for most others.) Likewise, the string ø§¶®€œ¥√DZ would incorrectly be diagnosed as having 22 [!] rather than just 10 characters, as both € and √ are encoded using 3 bytes and the remaining 8 characters are encoded using 2 bytes each. Clearly, it's important to use the function unicode.utf8.len in the present context.
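To make the difference between bytes and characters concrete, here is a minimal sketch that prints both counts for the same string. It reuses the setup from the LuaLaTeX example above; the helper names \numbytes and \numcodepoints are assumptions introduced only for this comparison.

```latex
% !TEX TS-program = lualatex
\documentclass{article}
\usepackage{fontspec}
\usepackage{luacode} % for "\luastring" macro
% Hypothetical helpers, for illustration only:
% \numbytes counts bytes, \numcodepoints counts UTF-8 characters.
\newcommand{\numbytes}[1]{\directlua{tex.sprint(string.len(\luastring{#1}))}}
\newcommand{\numcodepoints}[1]{\directlua{tex.sprint(unicode.utf8.len(\luastring{#1}))}}
\begin{document}
\noindent The string ``öüß'' has \numcodepoints{öüß} characters
but occupies \numbytes{öüß} bytes.
\end{document}
```

For öüß this should print 3 characters and 6 bytes, matching the numbers discussed above.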
The command \newcommand{\numchars}[1]... works well, but I encountered some issues with \stringlength in the stringstrings package. It seems to have a limit of 500 characters, returning zero if the string is longer than that. For example, the code:
\documentclass[11pt]{amsart}
\usepackage{stringstrings}
\newcommand{\numchars}[1]{\noindent The string ``#1'' has \stringlength{#1} characters.\\}
\begin{document}
\numchars{Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Ut purus elit, vestibulum ut, placerat ac, adipiscing vitae, felis. Curabitur dictum gravida mauris. Nam arcu libero, nonummy eget, consectetuer id, vulputate a, magna. Donec vehicula augue eu neque. Pellentesque habitant morbi tris- tique senectus et netus et malesuada fames ac turpis egestas. Mauris ut leo. Cras viverra metus rhoncus sem. Nulla et lectus vestibulum urna fringilla ultrices. Phasellus eu tellus sit amet tortor gravida placerat. Integer sapien est, iaculis in, pretium quis, viverra ac, nunc. Praesent eget sem vel leo ultrices bibendum. Aenean faucibus. Morbi dolor nulla, malesuada eu, pul- vinar at, mollis ac, nulla. Curabitur auctor semper nulla. Donec varius orci eget risus. Duis nibh mi, congue eu, accumsan eleifend, sagittis quis, diam. Duis eget orci sit amet orci dignissim rutrum.}
\end{document}
Returns:
The command \StrLen in the xstring package seems to work better. The document:
\documentclass[11pt]{amsart}
\usepackage{xstring}
\newcommand{\numchars}[1]{\noindent The string ``#1'' has {\StrLen{#1}} characters.\\}
\begin{document}
\numchars{Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Ut purus elit, vestibulum ut, placerat ac, adipiscing vitae, felis. Curabitur dictum gravida mauris. Nam arcu libero, nonummy eget, consectetuer id, vulputate a, magna. Donec vehicula augue eu neque. Pellentesque habitant morbi tris- tique senectus et netus et malesuada fames ac turpis egestas. Mauris ut leo. Cras viverra metus rhoncus sem. Nulla et lectus vestibulum urna fringilla ultrices. Phasellus eu tellus sit amet tortor gravida placerat. Integer sapien est, iaculis in, pretium quis, viverra ac, nunc. Praesent eget sem vel leo ultrices bibendum. Aenean faucibus. Morbi dolor nulla, malesuada eu, pul- vinar at, mollis ac, nulla. Curabitur auctor semper nulla. Donec varius orci eget risus. Duis nibh mi, congue eu, accumsan eleifend, sagittis quis, diam. Duis eget orci sit amet orci dignissim rutrum.}
\end{document}
Returns: