Flattening a collection
If you are using Java 8, you could do something like this:
someMap.values().forEach(someList::addAll);
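For context, here is a minimal, self-contained sketch of that one-liner. It assumes someMap is a Map<String, List<String>> and that someList is a mutable list created beforehand; the class name FlattenWithForEach and the helper flattenValues are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FlattenWithForEach {
    // Hypothetical helper: copies every value list of the map into one new list.
    static List<String> flattenValues(Map<String, List<String>> someMap) {
        List<String> someList = new ArrayList<>();
        someMap.values().forEach(someList::addAll);
        return someList;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output order is predictable.
        Map<String, List<String>> someMap = new LinkedHashMap<>();
        someMap.put("a", Arrays.asList("1", "2"));
        someMap.put("b", Arrays.asList("3"));
        System.out.println(flattenValues(someMap)); // prints [1, 2, 3]
    }
}
```

With a plain HashMap the elements of each inner list stay together, but the order of the groups follows the map's iteration order.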
A search for "java 8 flatten" turns up only this question, and it is not actually about flattening a stream. So, for the greater good, I'll just leave the stream version here:
.flatMap(Collection::stream)
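A sketch of how that fragment might sit in a complete pipeline, assuming the input is a List<List<String>>; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

class FlattenWithFlatMap {
    // Hypothetical helper: turns a list of lists into one flat list.
    static List<String> flatten(List<List<String>> nested) {
        return nested.stream()
                .flatMap(Collection::stream) // each inner list becomes a stream of its elements
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<String>> nested = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"),
                Arrays.asList("d", "e"));
        System.out.println(flatten(nested)); // prints [a, b, c, d, e]
    }
}
```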
I'm also surprised that no one has given a concurrent Java 8 answer to the original question, which is:
.collect(ArrayList::new, ArrayList::addAll, ArrayList::addAll);
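A sketch of that collect call in context, assuming "concurrent" here means "usable with a parallel stream" and that the source is a Map<String, List<String>>. The class and method names are hypothetical. The three-argument collect asks the supplier for a fresh ArrayList per partial result, fills it with the accumulator, and merges partial results with the combiner, so no list is shared between threads while it is being filled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FlattenWithCollect {
    // Hypothetical helper: flattens the map's value lists using a parallel stream.
    static List<String> flatten(Map<String, List<String>> someMap) {
        return someMap.values()
                .parallelStream()
                .collect(ArrayList::new,      // supplier: new container per partial result
                         ArrayList::addAll,   // accumulator: add one inner list
                         ArrayList::addAll);  // combiner: merge two partial containers
    }

    public static void main(String[] args) {
        Map<String, List<String>> someMap = new LinkedHashMap<>();
        someMap.put("a", Arrays.asList("1", "2"));
        someMap.put("b", Arrays.asList("3", "4"));
        System.out.println(flatten(someMap));
    }
}
```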
If you are using Java 8 and prefer not to instantiate a List yourself, as in the suggested (and accepted) solution
someMap.values().forEach(someList::addAll);
you could do it all with streams in this one statement:
List<String> someList = map.values().stream().flatMap(c -> c.stream()).collect(Collectors.toList());
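One caveat: the Javadoc for Collectors.toList() gives no guarantees about the type or mutability of the returned List. If a specific implementation is needed, Collectors.toCollection can be used instead. A sketch, with hypothetical class and method names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class FlattenToArrayList {
    // Hypothetical helper: same pipeline as above, but the result is explicitly an ArrayList.
    static ArrayList<String> flattenValues(Map<String, List<String>> map) {
        return map.values().stream()
                .flatMap(c -> c.stream())
                .collect(Collectors.toCollection(ArrayList::new));
    }

    public static void main(String[] args) {
        Map<String, List<String>> map = new LinkedHashMap<>();
        map.put("test", Arrays.asList("1", "2"));
        map.put("test2", Arrays.asList("10", "20"));
        ArrayList<String> flat = flattenValues(map);
        flat.add("extra"); // safe: the result is known to be a mutable ArrayList
        System.out.println(flat); // prints [1, 2, 10, 20, extra]
    }
}
```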
By the way, it may be interesting to know that on Java 8 the accepted version does indeed seem to be the fastest. It takes about the same time as a
for (List<String> item : someMap.values()) ...
and is way faster than the pure streaming solution. Here is my little test code. I explicitly don't call it a benchmark, to avoid the resulting discussion of benchmark flaws. ;) I run every test twice, hoping that the second run uses fully compiled code.
Map<String, List<String>> map = new HashMap<>();
long millis;
map.put("test", Arrays.asList("1", "2", "3", "4"));
map.put("test2", Arrays.asList("10", "20", "30", "40"));
map.put("test3", Arrays.asList("100", "200", "300", "400"));
int maxcounter = 1000000;
System.out.println("1 stream flatmap");
millis = System.currentTimeMillis();
for (int i = 0; i < maxcounter; i++) {
    List<String> someList = map.values().stream().flatMap(c -> c.stream()).collect(Collectors.toList());
}
System.out.println(System.currentTimeMillis() - millis);
System.out.println("1 parallel stream flatmap");
millis = System.currentTimeMillis();
for (int i = 0; i < maxcounter; i++) {
    List<String> someList = map.values().parallelStream().flatMap(c -> c.stream()).collect(Collectors.toList());
}
System.out.println(System.currentTimeMillis() - millis);
System.out.println("1 foreach");
millis = System.currentTimeMillis();
for (int i = 0; i < maxcounter; i++) {
    List<String> mylist = new ArrayList<String>();
    map.values().forEach(mylist::addAll);
}
System.out.println(System.currentTimeMillis() - millis);
System.out.println("1 for");
millis = System.currentTimeMillis();
for (int i = 0; i < maxcounter; i++) {
    List<String> mylist = new ArrayList<String>();
    for (List<String> item : map.values()) {
        mylist.addAll(item);
    }
}
System.out.println(System.currentTimeMillis() - millis);
System.out.println("2 stream flatmap");
millis = System.currentTimeMillis();
for (int i = 0; i < maxcounter; i++) {
    List<String> someList = map.values().stream().flatMap(c -> c.stream()).collect(Collectors.toList());
}
System.out.println(System.currentTimeMillis() - millis);
System.out.println("2 parallel stream flatmap");
millis = System.currentTimeMillis();
for (int i = 0; i < maxcounter; i++) {
    List<String> someList = map.values().parallelStream().flatMap(c -> c.stream()).collect(Collectors.toList());
}
System.out.println(System.currentTimeMillis() - millis);
System.out.println("2 foreach");
millis = System.currentTimeMillis();
for (int i = 0; i < maxcounter; i++) {
    List<String> mylist = new ArrayList<String>();
    map.values().forEach(mylist::addAll);
}
System.out.println(System.currentTimeMillis() - millis);
System.out.println("2 for");
millis = System.currentTimeMillis();
for (int i = 0; i < maxcounter; i++) {
    List<String> mylist = new ArrayList<String>();
    for (List<String> item : map.values()) {
        mylist.addAll(item);
    }
}
System.out.println(System.currentTimeMillis() - millis);
And here are the results (elapsed time in milliseconds):
1 stream flatmap
468
1 parallel stream flatmap
1529
1 foreach
140
1 for
172
2 stream flatmap
296
2 parallel stream flatmap
1482
2 foreach
156
2 for
141
Edit 2016-05-24 (two years later):
Running the same test with a current Java 8 version (U92) on the same machine:
1 stream flatmap
313
1 parallel stream flatmap
3257
1 foreach
109
1 for
141
2 stream flatmap
219
2 parallel stream flatmap
3830
2 foreach
125
2 for
140
It seems that there is a speedup for sequential processing of streams and an even larger overhead for parallel streams.
Edit 2018-10-18 (four years later):
Now using Java 10 (10.0.2) on the same machine:
1 stream flatmap
393
1 parallel stream flatmap
3683
1 foreach
157
1 for
175
2 stream flatmap
243
2 parallel stream flatmap
5945
2 foreach
128
2 for
187
The overhead of parallel streaming seems to have grown even larger.
Edit 2020-05-22 (six years later):
Now using Java 14 (14.0.0.36) on a different machine:
1 stream flatmap
299
1 parallel stream flatmap
3209
1 foreach
202
1 for
170
2 stream flatmap
178
2 parallel stream flatmap
3270
2 foreach
138
2 for
167
Note that this run was done on a different machine (though, I think, a comparable one). The parallel streaming overhead seems to be considerably smaller than before.
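The repeated timing loops in the test code could be condensed with a small helper. The following is only a sketch: the class FlattenTiming and the methods timeMillis, withStream and withForEach are hypothetical names, and it has the same limitations as the hand-rolled timing above, so the numbers it prints are just as rough.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;
import java.util.stream.Collectors;

class FlattenTiming {
    // Hypothetical helper: runs the task `iterations` times and returns elapsed milliseconds.
    static long timeMillis(int iterations, Supplier<List<String>> task) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.get();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    static List<String> withStream(Map<String, List<String>> map) {
        return map.values().stream().flatMap(c -> c.stream()).collect(Collectors.toList());
    }

    static List<String> withForEach(Map<String, List<String>> map) {
        List<String> mylist = new ArrayList<>();
        map.values().forEach(mylist::addAll);
        return mylist;
    }

    public static void main(String[] args) {
        Map<String, List<String>> map = new HashMap<>();
        map.put("test", Arrays.asList("1", "2", "3", "4"));
        map.put("test2", Arrays.asList("10", "20", "30", "40"));
        map.put("test3", Arrays.asList("100", "200", "300", "400"));
        int maxcounter = 1_000_000;
        // Two rounds, mirroring the "run every test twice" approach of the original test.
        for (int round = 1; round <= 2; round++) {
            System.out.println(round + " stream flatmap: " + timeMillis(maxcounter, () -> withStream(map)));
            System.out.println(round + " foreach: " + timeMillis(maxcounter, () -> withForEach(map)));
        }
    }
}
```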