Removing all whitespace characters except for " "

Try using this regular expression:

[^\S ]+

It's a bit confusing to read because of the double negative. The character class [\S ] matches the characters you want to keep: either a space or any non-whitespace character. The negated character class [^\S ] therefore matches exactly the characters you want to remove, i.e. every whitespace character except the space.
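As a minimal sketch of how this pattern might be applied in Java: the class name WhitespaceStripper and the method stripNonSpaceWhitespace are made up for illustration, and the backslash is doubled only because the regex sits inside a Java string literal.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class WhitespaceStripper {
    // Matches runs of whitespace characters other than the plain space.
    private static final Pattern NON_SPACE_WHITESPACE = Pattern.compile("[^\\S ]+");

    // Removes tabs, line breaks and other whitespace, but keeps ' '.
    static String stripNonSpaceWhitespace(String input) {
        return NON_SPACE_WHITESPACE.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String text = "one\ttwo three\r\nfour";
        System.out.println(stripNonSpaceWhitespace(text)); // prints "onetwo threefour"
    }
}
```

For a one-off call, text.replaceAll("[^\\S ]+", "") should behave the same way; precompiling the Pattern only matters if the replacement runs often.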


Using a Guava CharMatcher:

String text = ...
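// Guava 19+ provides CharMatcher.whitespace(); older releases expose the CharMatcher.WHITESPACE constant instead.
String stripped = CharMatcher.whitespace().and(CharMatcher.isNot(' '))
    .removeFrom(text);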

If you only want those characters trimmed from the start and end of the string (as String.trim() does), use trimFrom rather than removeFrom.


Java has no dedicated syntax for subtracting character classes; if it did, you could write something like [\s--[ ]] (note the double dash). You can always simulate set subtraction by intersecting with the complement, so

[\s&&[^ ]]

should work. It's no better than [^\S ]+ from the first answer, but the principle is different and it's good to know both.
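To illustrate that the two patterns agree, here is a small sketch that runs both over the same input. The class and method names are hypothetical, and a + quantifier is added to the intersection form so that it also consumes runs of characters.

```java
// Hypothetical demo class comparing the two approaches from this thread.
class IntersectionDemo {
    // Negated class: anything that is neither non-whitespace nor a space.
    static String viaNegation(String s) {
        return s.replaceAll("[^\\S ]+", "");
    }

    // Intersection: whitespace AND not-a-space (set subtraction via complement).
    static String viaIntersection(String s) {
        return s.replaceAll("[\\s&&[^ ]]+", "");
    }

    public static void main(String[] args) {
        String sample = "a\tb c\n d\f";
        System.out.println(viaNegation(sample));     // prints "ab c d"
        System.out.println(viaIntersection(sample)); // prints "ab c d"
    }
}
```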

Tags:

Java

Regex