Add a space after the ending dot of a sentence
For the latest requirement:
rule = RegularExpression["(?!\\d\\.\\d)(\\w\\.+)(?! )(\\w?)"] -> "$1 $2";
StringReplace[rule] @ {text1, text2}
{"This is a sample text. Just 1.2 to test. To add a space after the dot... Okay. ", "I watched football. 10 people played in 2 teams, my friend was player number 7. 20 minutes later the game ended with the score 2:1. Then I went home. "}
"Negative lookahead" ("(?!...)") is used, BTW.
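To see what each lookahead contributes, here is a minimal sketch on a made-up string. The sample input and the names demoRule, demoText and demoResult are illustrative assumptions, not part of the original question. The first lookahead skips a digit that starts a decimal number, and the second one skips dots that are already followed by a space:

```wolfram
(* Same pattern as above; the names and the sample input are hypothetical *)
demoRule = RegularExpression["(?!\\d\\.\\d)(\\w\\.+)(?! )(\\w?)"] -> "$1 $2";
demoText = "v1.2 ok.Go on. Done...End";
demoResult = StringReplace[demoText, demoRule]
(* "v1.2 ok. Go on. Done... End" *)
```

Here "1.2" and "on. " are left alone, while "ok.Go" and "Done...End" each get a space after the dot(s).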
Older-er-er response
StringReplace[text, RegularExpression["(\\w\\.)(\\w)"] -> "$1 $2"]
"\\w" means a word character: a letter, a digit or the underscore _. "\\." means a literal period/dot. So the regular expression matches a string pattern of length three: a word character, followed by a period, followed by a word character.
Parentheses define a group, and "$n", where n is an integer, represents the contents of the n-th group.
So the whole operation is to add a blank between the two groups after locating them by the string pattern.
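As a quick check of that description, the sketch below applies the same expression to a made-up string (the sample input and the names oldText and oldResult are assumptions for illustration). Because "\\w" also matches digits, a decimal number gets split as well, which is presumably what the newer version with the lookahead is meant to avoid:

```wolfram
(* Hypothetical input: one missing space and one decimal number *)
oldText = "One.Two costs 1.2";
oldResult = StringReplace[oldText, RegularExpression["(\\w\\.)(\\w)"] -> "$1 $2"]
(* "One. Two costs 1. 2" *)
```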
StringReplace[text, "." ~~ a : Except[DigitCharacter | WhitespaceCharacter | "."] :> ". " <> a]
{"This is a sample text. Just 1.2 to test. To add a space after the dot... Okay."}
Also
StringReplace[text, a : (LetterCharacter | ".") ~~ "." ~~ b : LetterCharacter :> a <> ". " <> b]
This gives the same result.
StringReplace[StringReplace[text, "." -> ". "], ". " ~~ EndOfString -> "."]
{"This is a sample text. Just to test. To add a space after the dot. Okay."}