Extract the numerical and non-numerical portion from text
An approach using the LaTeX3 l3regex module:
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{array,booktabs,expl3,l3regex}
\ExplSyntaxOn
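% A compiled regex belongs in a regex variable, not a token list variable.
\regex_new:N \l_extract_regex
% Group 1: optional sign, digits, optional decimal point, digits. Group 2: the rest.
\regex_set:Nn \l_extract_regex { ^\s*([+-]?\d*\.?\d*)\s*(.*) }
\seq_new:N \l_extract_seq
\tl_new:N \NumberValue
\tl_new:N \OtherValue
\cs_new_protected:Npn \extract_number:n #1
{
\regex_extract_once:NnN \l_extract_regex {#1} \l_extract_seq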
\tl_gset:Nx \NumberValue { \seq_item:Nn \l_extract_seq { 2 } }
\tl_gset:Nx \OtherValue { \seq_item:Nn \l_extract_seq { 3 } }
}
\cs_new_protected:Npn \Test #1
{
\extract_number:n {#1}
& \detokenize{#1} & \NumberValue & \OtherValue
}
\ExplSyntaxOff
\begin{document}
\begin{tabular}{l>{\ttfamily}r>{\ttfamily}r>{\ttfamily}r}
\toprule
& \multicolumn{1}{r}{Input} &
\multicolumn{1}{r}{Digit} & \multicolumn{1}{r}{Non-digit} \\
\midrule
Decimal: \Test{ 1.01abc} \\
\Test{+2.01abc} \\
\Test{-3.01abc} \\
\midrule
Integer: \Test{ abc} \\
\Test{ 5abc} \\
\Test{+6abc} \\
\Test{-7abc} \\
\midrule
Floating Point: \Test{ 5.34abc} \\
\Test{+6.34abc} \\
\Test{-7.34abc} \\
\midrule
Number Only: \Test{3} \\
\Test{3.2} \\
\Test{-5.1} \\
\Test{+5.1} \\
\midrule
No Digits: \Test{abc} \\
\midrule
Formatted Text: \Test{ 8$abc_1$} \\
\Test{-8.2$abc_1$} \\
\Test{+$abc_1$} \\
\Test{$abc_1$} \\
\bottomrule
\end{tabular}
\end{document}
Currently, this module is 'experimental', hence loading it separately from expl3, but I'd expect it to move to the 'kernel' in the near-ish future (before the end of the year).
This works because a regular expression match stores its results in a sequence: the complete match comes first, followed by each capturing group in order. Since \seq_item:Nn counts from 1, item 2 is the first capture group (the numerical part) and item 3 is the second (the non-numerical part). Notice that I've also included \s* before each group to skip any leading spaces: if you leave that out, the spaces become part of the captured text.
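To inspect what the sequence actually holds, a throwaway test along the following lines can help. It is only a sketch: \l_demo_seq is a hypothetical scratch variable, and the snippet assumes the same packages as the document above are loaded.

```latex
\ExplSyntaxOn
\seq_new:N \l_demo_seq % hypothetical scratch variable, not part of the answer
% Same pattern as above, applied to " -3.01abc" (~ is a space under \ExplSyntaxOn).
\regex_extract_once:nnN { ^\s*([+-]?\d*\.?\d*)\s*(.*) } { ~-3.01abc } \l_demo_seq
% Item 1: the complete match, including the leading space
% Item 2: first capture group,  -3.01
% Item 3: second capture group, abc
\seq_show:N \l_demo_seq % writes the items to the terminal and log
\ExplSyntaxOff
```

This is why \extract_number:n reads items 2 and 3 rather than 1 and 2.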
Also notice that the results here are detokenized, so if you want formatted text you'd need to \scantokens the results. (Something as simple as \scantokens\expandafter{\OtherValue} would do here.)
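As an alternative to a bare \scantokens, expl3's \tl_rescan:nn can re-tokenize the stored text. The wrapper name \OtherValueFormatted is hypothetical, and the sketch assumes \OtherValue has already been set by \extract_number:n and holds a single line of text.

```latex
\ExplSyntaxOn
% Hypothetical wrapper: re-read the detokenized contents of \OtherValue
% with the category codes in force where the command is used.
\cs_new_protected:Npn \OtherValueFormatted
  { \exp_args:NnV \tl_rescan:nn { } \OtherValue }
\ExplSyntaxOff
```

Using \OtherValueFormatted in place of \OtherValue in the last column of \Test should then typeset an entry such as $abc_1$ as math rather than as literal characters.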
If you can use LuaTeX, you can use a proper parser (the code below is for ConTeXt, only because I don't know all the details of using LuaTeX in LaTeX).
\startluacode
local P, R, S, V, match = lpeg.P, lpeg.R, lpeg.S, lpeg.V, lpeg.match
local Ct, C, Cs, Cc = lpeg.Ct, lpeg.C, lpeg.Cs, lpeg.Cc
local format = string.format
local digit = R("09")
local sign = S('+-')
-- sign^-1 accepts at most one sign (sign^0 would also accept runs such as "+-+")
local integer = sign^-1 * digit^0 -- NOTE: I'd rather use digit^1, but
-- the requirements want to capture a
-- single sign as well
local float = sign^-1 * digit^0 * P('.') * digit^1
local space = P(" ")^0
local number = Cs(float + integer)
local any = Cs(P(1)^0)
local number_value = Cc("\\global\\def\\NumberValue{%s}") * number / format
local other_value = Cc("\\global\\def\\OtherValue{%s}") * any / format
local parser = Cs(space * number_value * other_value)
function commands.extract_number(s)
context(match(parser,s))
end
\stopluacode
\unprotect
\def\extract#1%
{\let\NumberValue\relax
\let\OtherValue \relax
\ctxcommand{extract_number(\!!bs\detokenize{#1}\!!es)}}
\protect
You can then use this as follows.
\def\Test#1%
{\extract{#1}%
#1 \NC \NumberValue \NC \OtherValue}
\starttext
\starttabulate[|l|r|r|r|]
\HL
\NC \NC Input \NC Digit \NC Non-Digit \NC \NR
\HL
\NC Decimal: \NC \Test{ 1.01abc} \NC \NR
\NC \NC \Test{+2.01abc} \NC \NR
\NC \NC \Test{-3.01abc} \NC \NR
\HL
\NC Integer: \NC \Test{ abc} \NC \NR
\NC \NC \Test{ 5abc} \NC \NR
\NC \NC \Test{+6abc} \NC \NR
\NC \NC \Test{-7abc} \NC \NR
\HL
\NC Floating Point: \NC \Test{ 5.34abc} \NC \NR
\NC \NC \Test{+6.34abc} \NC \NR
\NC \NC \Test{-7.34abc} \NC \NR
\HL
\NC Number Only: \NC \Test{3} \NC \NR
\NC \NC \Test{3.2} \NC \NR
\NC \NC \Test{-5.1} \NC \NR
\NC \NC \Test{+5.1} \NC \NR
\HL
\NC No Digits: \NC \Test{abc} \NC \NR
\HL
\NC Formatted Text: \NC \Test{ 8$abc_1$} \NC \NR
\NC \NC \Test{-8.2$abc_1$} \NC \NR
\NC \NC \Test{+$abc_1$} \NC \NR
\NC \NC \Test{$abc_1$} \NC \NR
\HL
\stoptabulate
\stoptext
This typesets a table with one row per input, showing the input, the extracted number, and the remaining text.
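For LaTeX users, a rough LuaLaTeX counterpart might look like the following. It is a sketch under stated assumptions rather than a transcription of the ConTeXt code: it relies on the luacode package, the Lua function name extract_number and the macro \extract are chosen for illustration, and the grammar is simplified to at most one sign followed by digits and an optional fractional part.

```latex
\documentclass{article}
\usepackage{luacode}
\begin{luacode*}
-- Assumed LuaLaTeX port; extract_number is a hypothetical global function.
local P, R, S, C = lpeg.P, lpeg.R, lpeg.S, lpeg.C
local digit  = R("09")
local sign   = S("+-")^-1                                  -- at most one sign
local number = C(sign * digit^0 * (P(".") * digit^1)^-1)   -- may be empty
local rest   = C(P(1)^0)                                   -- everything else
local parser = P(" ")^0 * number * rest                    -- skip leading spaces

function extract_number(s)
  local num, other = parser:match(s)
  -- Hand the two parts back to TeX as global macro definitions.
  tex.sprint("\\gdef\\NumberValue{" .. num .. "}")
  tex.sprint("\\gdef\\OtherValue{" .. other .. "}")
end
\end{luacode*}
% \luastringN passes the argument to Lua as a string without expanding it.
\newcommand*\extract[1]{\directlua{extract_number(\luastringN{#1})}}
\begin{document}
\extract{-8.2$abc_1$}\NumberValue\ and \OtherValue
\end{document}
```

Because tex.sprint feeds the text back under the current category codes, the non-numerical part keeps its formatting here, so no separate rescanning step should be needed.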
Here is a solution with xstring:
\documentclass[border=2pt]{standalone}
\usepackage{booktabs}
\usepackage{xstring}
\makeatletter
% first, need to fix a bug in xstring:
\@xs@newmacro\IfDecimal{}{1}{0}{%
\@xs@formatnumber{#1}\@xs@reserved@A
\decimalpart\z@
\afterassignment\@xs@defafterinteger\integerpart\@xs@reserved@A\relax\@xs@nil
\expandafter\@xs@testdot\@xs@afterinteger\@xs@nil
\ifx\@empty\@xs@afterdecimal\expandafter\@firstoftwo\else\expandafter\@secondoftwo\fi}
\newcommand*\Test[1]{%
\IfBeginWith{#1}{ }{\StrBehind{#1}{ }[\temp@@]}{\def\temp@@{#1}}%
\IfDecimal\temp@@
{\def\temp@{#1&}}
{\def\temp@{#1&}%
\StrBefore{#1}\@xs@afterdecimal[\temp@@]%
\expandafter\g@addto@macro\expandafter\temp@\expandafter{\temp@@&}%
\expandafter\g@addto@macro\expandafter\temp@\expandafter{\@xs@afterdecimal}%
}%
\temp@\\}
\makeatother
\begin{document}
\begin{tabular}{l r r r}
& &Number &Non-Digits\\
\midrule
Decimal:
&\Test{ 1.01abc}
&\Test{+2.01abc}
&\Test{-3.01abc}
\midrule
Integer:
&\Test{ abc}
&\Test{ 5abc}
&\Test{+6abc}
&\Test{-7abc}
\midrule
Floating Point:
&\Test{ 5.34abc}
&\Test{+6.34abc}
&\Test{-7.34abc}
\midrule
Number Only:
&\Test{3}
&\Test{3.2}
&\Test{-5.1}
&\Test{+5.1}
\midrule
No Digits:
&\Test{abc}
\midrule
Formatted Text:
&\Test{ 8$abc_1$}
&\Test{-8.2$abc_1$}
&\Test{+$abc_1$}
&\Test{$abc_1$}
\end{tabular}
\end{document}
EDIT: here is how to do it with separate \ExtractLeadingNumber and \ExtractTralingNonDigits macros:
\makeatletter
% first, need to fix a bug in xstring:
\@xs@newmacro\IfDecimal{}{1}{0}{%
\@xs@formatnumber{#1}\@xs@reserved@A
\decimalpart\z@
\afterassignment\@xs@defafterinteger\integerpart\@xs@reserved@A\relax\@xs@nil
\expandafter\@xs@testdot\@xs@afterinteger\@xs@nil
\ifx\@empty\@xs@afterdecimal\expandafter\@firstoftwo\else\expandafter\@secondoftwo\fi}
\newcommand*\ExtractLeadingNumber[1]{%
\IfBeginWith{#1}{ }{\StrBehind{#1}{ }[\temp@@]}{\def\temp@@{#1}}%
\IfDecimal\temp@@{#1}{\StrBefore{#1}\@xs@afterdecimal}%
}
\newcommand*\ExtractTralingNonDigits[1]{%
\IfBeginWith{#1}{ }{\StrBehind{#1}{ }[\temp@@]}{\def\temp@@{#1}}%
\IfDecimal\temp@@{}\@xs@afterdecimal
}
\makeatother
\newcommand*\Test[1]{#1&\ExtractLeadingNumber{#1}&\ExtractTralingNonDigits{#1}\\}