Extract strings containing digit characters with StringCases

string = {"text, 1998, TEXTDATA, text, 2007, NEXTTEXTDATA"};
StringCases[string, DigitCharacter .. ~~ ", " ~~ w : (LetterCharacter ..) :> w]

{{"TEXTDATA", "NEXTTEXTDATA"}}

StringCases[string, NumberString ~~ ", " ~~ w : (LetterCharacter ..) :> w]

{{"TEXTDATA", "NEXTTEXTDATA"}}

StringCases[string, ", " ~~ w : (LetterCharacter ..) ~~ (EndOfString | ",") :> w]

{{"TEXTDATA", "NEXTTEXTDATA"}}

For the moment, it seems sufficient to read the selection criterion as "find all substrings consisting of uppercase letters". To express that, one can use RegularExpression or CharacterRange to construct the string pattern.

string = {"text, 1998, TEXTDATA, text, 2007, NEXTTEXTDATA"};
stringpattern1 = RegularExpression["[[:upper:]]+"];
stringpattern2 = CharacterRange["A", "Z"] ..;
StringCases[string, stringpattern1]
StringCases[string, stringpattern2]

Both give the same result:

{{"TEXTDATA", "NEXTTEXTDATA"}}

If one must instead select the substrings by their surroundings rather than by properties of the target substrings themselves, the string pattern can be written as

stringpattern3 = RegularExpression["\\d, ([^ ,]+),?"] -> "$1";

which says: "capture the parenthesized run of characters other than a blank or a comma that follows a digit, a comma, and a blank, and that is optionally followed by a single comma".
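The rule above is defined but not applied, so here is a minimal sketch of how it might be used, assuming the same sample `string` as in the other snippets. Note that `\\d` only checks the single digit immediately before the comma, and `[^ ,]+` accepts any characters other than a blank or a comma, so on other data this could also capture fields that are not uppercase text.

```wolfram
(* sample input, copied from the snippets above *)
string = {"text, 1998, TEXTDATA, text, 2007, NEXTTEXTDATA"};

(* rule: match "digit, comma, blank, field, optional comma" and return only the captured field *)
stringpattern3 = RegularExpression["\\d, ([^ ,]+),?"] -> "$1";

StringCases[string, stringpattern3]
(* → {{"TEXTDATA", "NEXTTEXTDATA"}} *)
```

The trailing `,?` is consumed as part of each match, but since no capture depends on it, the result is the same as without it for this input.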

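string = {"text, 1998, TEXTDATA, text, 2007, NEXTTEXTDATA"};
(* split the single string into its comma-separated fields *)
list = StringSplit[string[[1]], ", "];
(* mark the numeric fields, then shift the marks one position to the right so that they flag the field after each number *)
filter = RotateRight[StringMatchQ[list, NumberString]];

(* keep only the flagged fields *)
Pick[list, filter]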

{"TEXTDATA", "NEXTTEXTDATA"}
