Java StringTokenizer.nextToken() skips over empty fields
I would use Guava's Splitter, which doesn't need the heavy regex machinery and is better behaved than String's split() method:
Iterable<String> parts = Splitter.on('\t').split(string);
There is an RFE in Sun's bug database about this StringTokenizer issue, with the status "Will not fix".
The evaluation of this RFE states, I quote:
"With the addition of the java.util.regex package in 1.4.0, we have basically obsoleted the need for StringTokenizer. We won't remove the class for compatibility reasons. But regex gives you simply what you need."
It then suggests using the String#split(String) method.
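To make the difference concrete, here is a minimal sketch comparing the two approaches on a tab-separated line with an empty middle field. The class and method names (TokenizerVsSplit, withTokenizer, withSplit) are made up for illustration and are not part of the RFE.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, named only for this example.
class TokenizerVsSplit {
    // Collects tokens as StringTokenizer reports them: consecutive
    // delimiters are merged, so empty fields disappear.
    static List<String> withTokenizer(String line) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, "\t");
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    // Splits on tab with String#split(String): an empty field between
    // two tabs is returned as an empty string.
    static List<String> withSplit(String line) {
        return Arrays.asList(line.split("\t"));
    }

    public static void main(String[] args) {
        String line = "a\t\tc";
        System.out.println(withTokenizer(line)); // [a, c]
        System.out.println(withSplit(line));     // [a, , c]
    }
}
```

Note that split(String) still drops empty strings at the end of the line, so it only fully preserves the column count when the last field is non-empty.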
Thank you all. Thanks to the first comment I was able to find a solution. Yes, you are right, and thank you for the reference:
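// try-with-resources closes the Scanner (and the file) when done
try (Scanner s = new Scanner(new File("data.txt"))) {
    while (s.hasNextLine()) {
        String line = s.nextLine();
        // A negative limit keeps all empty fields, including trailing ones
        String[] items = line.split("\t", -1);
        System.out.println(items[5]);
        // System.out.println(Arrays.toString(items));
    }
}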
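For anyone wondering why the -1 matters in the snippet above, here is a small sketch of how the limit argument changes the result when a line ends in empty fields. The class and method names (SplitLimitDemo, fieldsDefault, fieldsKeepAll) are invented for this example.

```java
// Hypothetical demo class, named only for this example.
class SplitLimitDemo {
    // Default limit (0): trailing empty strings are removed from the result.
    static int fieldsDefault(String line) {
        return line.split("\t").length;
    }

    // Negative limit: every field is kept, including trailing empty ones.
    static int fieldsKeepAll(String line) {
        return line.split("\t", -1).length;
    }

    public static void main(String[] args) {
        String line = "a\t\tc\t\t"; // five columns, the last two empty
        System.out.println(fieldsDefault(line)); // 3
        System.out.println(fieldsKeepAll(line)); // 5
    }
}
```

Without the negative limit, items[5] could throw an ArrayIndexOutOfBoundsException on rows whose last columns are empty, so the -1 is what makes fixed column indexes safe here.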
You can use Apache Commons Lang's StringUtils.splitPreserveAllTokens(). It treats adjacent separators as delimiting empty tokens, which is exactly what you need.