Java 8 Parallel Stream Concurrent Grouping
You can either chain your grouping collectors which would give you a multi-level map. However, this is not ideal if you want to group by say more than 2 fields.
The better option would be to override the equals
and hashcode
methods within your Person
class to define the equality of two given objects which in this case would be all the said fields. Then you can group by Person
i.e groupingByConcurrent(Function.identity())
in which case you'll end up with:
ConcurrentMap<Person, List<Person>> resultSet = ....
Example:
class Person {
@Override
public boolean equals(Object o) {
if (this == o) return true;
if (o == null || getClass() != o.getClass()) return false;
Person person = (Person) o;
if (name != null ? !name.equals(person.name) : person.name != null) return false;
if (uid != null ? !uid.equals(person.uid) : person.uid != null) return false;
return phone != null ? phone.equals(person.phone) : person.phone == null;
}
@Override
public int hashCode() {
int result = name != null ? name.hashCode() : 0;
result = 31 * result + (uid != null ? uid.hashCode() : 0);
result = 31 * result + (phone != null ? phone.hashCode() : 0);
return result;
}
private String name;
private String uid; // these should be private, don't expose
private String phone;
// getters where necessary
// setters where necessary
}
then:
ConcurrentMap<Person, List<Person>> resultSet = list.parallelStream()
.collect(Collectors.groupingByConcurrent(Function.identity()));
You can do that by using the of
static factory method from Collector
:
Map<String, Set<Person>> groupBy = persons.parallelStream()
.collect(Collector.of(
ConcurrentHashMap::new,
( map, person ) -> {
map.computeIfAbsent(person.name, k -> new HashSet<>()).add(person);
map.computeIfAbsent(person.uid, k -> new HashSet<>()).add(person);
map.computeIfAbsent(person.phone, k -> new HashSet<>()).add(person);
},
( a, b ) -> {
b.forEach(( key, set ) -> a.computeIfAbsent(key, k -> new HashSet<>()).addAll(set));
return a;
}
));
As Holger in the comments suggested, following approach can be preferred over the above one:
Map<String, Set<Person>> groupBy = persons.parallelStream()
.collect(HashMap::new, (m, p) -> {
m.computeIfAbsent(p.name, k -> new HashSet<>()).add(p);
m.computeIfAbsent(p.uid, k -> new HashSet<>()).add(p);
m.computeIfAbsent(p.phone, k -> new HashSet<>()).add(p);
}, (a, b) -> b.forEach((key, set) -> {
a.computeIfAbsent(key, k -> new HashSet<>()).addAll(set));
});
It uses the overloaded collect
method which acts identical to my suggested statement above.