Best way to specify whitespace in a String.Split operation
Yes, there is a need for one more answer here!
All the solutions thus far address the rather limited domain of canonical input, to wit: a single whitespace character between elements (though tip of the hat to @cherno for at least mentioning the problem). But I submit that in all but the most obscure scenarios, splitting all of these should yield identical results:
string myStrA = "The quick brown fox jumps over the lazy dog";                  // single spaces
string myStrB = "The  quick  brown  fox  jumps  over  the  lazy  dog";          // doubled spaces
string myStrC = "The quick brown fox      jumps over the lazy dog";             // a run of spaces mid-sentence
string myStrD = "   The quick brown fox jumps over the lazy dog";               // leading whitespace
String.Split (in any of the flavors shown throughout the other answers here) simply does not work well unless you attach the RemoveEmptyEntries option with either of these:
myStr.Split(new char[0], StringSplitOptions.RemoveEmptyEntries)
myStr.Split(new char[] {' ','\t'}, StringSplitOptions.RemoveEmptyEntries)
Omitting the option yields four different results (labeled A, B, C, and D in the original illustration), whereas all four inputs yield one and the same result when you use RemoveEmptyEntries.
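To make the difference concrete, here is a minimal sketch that counts the entries each call produces. The four input strings and the helper name Count are my own assumptions for illustration, not the exact strings from the original illustration.

```csharp
using System;

class SplitDemo
{
    // Hypothetical helper: number of entries produced by a whitespace split.
    public static int Count(string text, StringSplitOptions options) =>
        text.Split(new char[0], options).Length;

    static void Main()
    {
        // Assumed sample inputs: same nine words, different whitespace.
        string[] inputs =
        {
            "The quick brown fox jumps over the lazy dog",
            "The  quick  brown  fox  jumps  over  the  lazy  dog",
            "The quick brown fox\t\tjumps over the lazy dog",
            "   The quick brown fox jumps over the lazy dog",
        };

        foreach (string input in inputs)
        {
            int plain = Count(input, StringSplitOptions.None);
            int cleaned = Count(input, StringSplitOptions.RemoveEmptyEntries);
            Console.WriteLine($"without option: {plain}, with option: {cleaned}");
        }
    }
}
```

With these inputs the plain split reports 9, 17, 10, and 12 entries, because every extra whitespace character contributes an empty string, while the RemoveEmptyEntries split reports 9 each time.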
Of course, if you don't like using options, just use the regex alternative :-)
Regex.Split(myStr, @"\s+").Where(s => s != string.Empty)
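For completeness, a small sketch of the regex variant with the usings it needs. The Where filter matters because Regex.Split returns an empty first (or last) element when the input starts (or ends) with whitespace. The helper name Words and the sample string are assumptions of mine.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

class RegexSplitDemo
{
    // Hypothetical helper: split on runs of whitespace, dropping empty entries.
    public static string[] Words(string text) =>
        Regex.Split(text, @"\s+").Where(s => s != string.Empty).ToArray();

    static void Main()
    {
        // Leading, trailing, and mixed whitespace (assumed sample input).
        string[] words = Words("   The quick\tbrown  fox ");
        Console.WriteLine(string.Join("|", words)); // prints The|quick|brown|fox
    }
}
```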
If you just call:
string[] ssize = myStr.Split(null); //Or myStr.Split()
or:
string[] ssize = myStr.Split(new char[0]);
then white-space is assumed to be the splitting character. From the string.Split(char[]) method's documentation page:
If the separator parameter is null or contains no characters, white-space characters are assumed to be the delimiters. White-space characters are defined by the Unicode standard and return true if they are passed to the Char.IsWhiteSpace method.
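A quick way to see what that definition buys you, as a sketch with a sample string I made up: the empty-separator split also breaks on a no-break space (U+00A0) and a newline, which an explicit space-and-tab list would miss.

```csharp
using System;

class WhitespaceDemo
{
    static void Main()
    {
        // Assumed sample: no-break space, tab, and newline as separators.
        string text = "a\u00A0b\tc\nd";

        Console.WriteLine(Char.IsWhiteSpace('\u00A0'));             // True
        Console.WriteLine(text.Split(new char[0]).Length);          // 4: a, b, c, d
        Console.WriteLine(text.Split(new[] { ' ', '\t' }).Length);  // 2: only the tab matches
    }
}
```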
Always, always, always read the documentation!