Java stream merge or reduce duplicate objects
The groupingBy operation (or something similar) is unavoidable: the Map created by the operation is also used during the operation for looking up the grouping keys and finding the duplicates. But you can combine it with the reduction of the group elements:
Map<String, Friend> uniqueFriendMap = friends.stream()
    .collect(Collectors.groupingBy(Friend::uniqueFunction,
        Collectors.collectingAndThen(
            Collectors.reducing((a,b) -> friendMergeFunction(a,b)), Optional::get)));
The values of the map are already the resulting distinct friends. If you really need a List, you can create it with a plain Collection operation:
List<Friend> mergedFriends = new ArrayList<>(uniqueFriendMap.values());
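For readers who want to try this out, here is a minimal, self-contained sketch of the grouping-plus-reduction step. The Friend class, its name and visits fields, and the merge rule (summing visits) are assumptions made up for illustration; only uniqueFunction and friendMergeFunction are names taken from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class GroupingReduceDemo {
    // Hypothetical stand-in for the question's Friend type.
    static final class Friend {
        final String name;
        final int visits;

        Friend(String name, int visits) {
            this.name = name;
            this.visits = visits;
        }

        // Assumed grouping key: friends with the same name are duplicates.
        String uniqueFunction() {
            return name;
        }
    }

    // Assumed merge rule: keep the name, add up the visits.
    static Friend friendMergeFunction(Friend a, Friend b) {
        return new Friend(a.name, a.visits + b.visits);
    }

    // Groups by key and reduces each group to a single Friend while grouping.
    static Map<String, Friend> mergeByKey(List<Friend> friends) {
        return friends.stream()
            .collect(Collectors.groupingBy(Friend::uniqueFunction,
                Collectors.collectingAndThen(
                    Collectors.reducing((a, b) -> friendMergeFunction(a, b)),
                    // Safe here: a group only exists if it has at least one element.
                    Optional::get)));
    }

    public static void main(String[] args) {
        List<Friend> friends = Arrays.asList(
            new Friend("Ann", 1), new Friend("Bob", 2), new Friend("Ann", 3));
        Map<String, Friend> uniqueFriendMap = mergeByKey(friends);
        List<Friend> mergedFriends = new ArrayList<>(uniqueFriendMap.values());
        // The two "Ann" entries collapse into one Friend with 4 visits.
        System.out.println(mergedFriends.size() + " distinct friends, Ann has "
            + uniqueFriendMap.get("Ann").visits + " visits");
    }
}
```

The Optional::get finisher is only reasonable because groupingBy never creates an empty group; with a different downstream source that guarantee would have to be checked separately.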
If this second operation still annoys you, you can hide it within the collect operation:
List<Friend> mergedFriends = friends.stream()
    .collect(Collectors.collectingAndThen(
        Collectors.groupingBy(Friend::uniqueFunction, Collectors.collectingAndThen(
            Collectors.reducing((a,b) -> friendMergeFunction(a,b)), Optional::get)),
        m -> new ArrayList<>(m.values())));
Since the nested collector represents a Reduction (see also this answer), we can use toMap instead:
List<Friend> mergedFriends = friends.stream()
    .collect(Collectors.collectingAndThen(
        Collectors.toMap(Friend::uniqueFunction, Function.identity(),
            (a,b) -> friendMergeFunction(a,b)),
        m -> new ArrayList<>(m.values())));
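A runnable sketch of the toMap variant may help to see it in context. As before, the Friend class and the merge rule are hypothetical placeholders. The second method is an optional variation that is not part of the answer above: as far as I know, the three-argument toMap makes no promise about the order of the resulting map, so if the encounter order of the merged friends matters, the four-argument overload with a LinkedHashMap supplier is one way to keep it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapMergeDemo {
    // Hypothetical stand-in for the question's Friend type.
    static final class Friend {
        final String name;
        final int visits;

        Friend(String name, int visits) {
            this.name = name;
            this.visits = visits;
        }

        // Assumed grouping key.
        String uniqueFunction() {
            return name;
        }
    }

    // Assumed merge rule: keep the name, add up the visits.
    static Friend friendMergeFunction(Friend a, Friend b) {
        return new Friend(a.name, a.visits + b.visits);
    }

    // Same shape as the snippet above: toMap merges values sharing a key.
    static List<Friend> merge(List<Friend> friends) {
        return friends.stream()
            .collect(Collectors.collectingAndThen(
                Collectors.toMap(Friend::uniqueFunction, Function.identity(),
                    (a, b) -> friendMergeFunction(a, b)),
                m -> new ArrayList<>(m.values())));
    }

    // Variation (an assumption, not from the answer): keep first-seen key order.
    static List<Friend> mergeKeepingOrder(List<Friend> friends) {
        Map<String, Friend> byKey = friends.stream()
            .collect(Collectors.toMap(Friend::uniqueFunction, Function.identity(),
                ToMapMergeDemo::friendMergeFunction, LinkedHashMap::new));
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        List<Friend> friends = Arrays.asList(
            new Friend("Ann", 1), new Friend("Bob", 2), new Friend("Ann", 3));
        for (Friend f : mergeKeepingOrder(friends)) {
            System.out.println(f.name + " " + f.visits);
        }
    }
}
```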
Depending on whether friendMergeFunction is a static method or instance method, you may replace (a,b) -> friendMergeFunction(a,b) with DeclaringClass::friendMergeFunction or this::friendMergeFunction.
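To illustrate both method reference forms side by side, here is a small sketch. The class MethodRefDemo, the method names mergeStatic and mergeInstance, and the Friend class are all hypothetical; in the question there is only a single friendMergeFunction, which would be either static or an instance method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class MethodRefDemo {
    // Hypothetical stand-in for the question's Friend type.
    static final class Friend {
        final String name;
        final int visits;

        Friend(String name, int visits) {
            this.name = name;
            this.visits = visits;
        }

        // Assumed grouping key.
        String uniqueFunction() {
            return name;
        }
    }

    // Static merge method: referenced as DeclaringClass::methodName.
    static Friend mergeStatic(Friend a, Friend b) {
        return new Friend(a.name, a.visits + b.visits);
    }

    // Instance merge method: referenced as this::methodName from instance code.
    Friend mergeInstance(Friend a, Friend b) {
        return new Friend(a.name, a.visits + b.visits);
    }

    static List<Friend> viaStaticRef(List<Friend> friends) {
        Map<String, Friend> byKey = friends.stream()
            .collect(Collectors.toMap(Friend::uniqueFunction, Function.identity(),
                MethodRefDemo::mergeStatic));
        return new ArrayList<>(byKey.values());
    }

    List<Friend> viaInstanceRef(List<Friend> friends) {
        Map<String, Friend> byKey = friends.stream()
            .collect(Collectors.toMap(Friend::uniqueFunction, Function.identity(),
                this::mergeInstance));
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        List<Friend> friends = Arrays.asList(
            new Friend("Ann", 1), new Friend("Bob", 2), new Friend("Ann", 3));
        System.out.println(viaStaticRef(friends).size());
        System.out.println(new MethodRefDemo().viaInstanceRef(friends).size());
    }
}
```

Note that this::mergeInstance only works inside an instance method (or constructor), since there is no this in a static context.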
But note that even within your original approach, several simplifications are possible. When you only process the values of a Map, you don’t need to use the entrySet(), which requires you to call getValue() on each entry. You can process the values() in the first place. Then, you don’t need the verbose input -> { return expression; } syntax, as input -> expression is sufficient. Since the groups of the preceding grouping operation cannot be empty, the filter step is obsolete. So your original approach would look like:
Map<String, List<Friend>> uniqueFriendMap
    = friends.stream().collect(Collectors.groupingBy(Friend::uniqueFunction));
List<Friend> mergedFriends = uniqueFriendMap.values().stream()
    .map(group -> group.stream().reduce((a,b) -> friendMergeFunction(a,b)).get())
    .collect(Collectors.toList());
which is not so bad. As said, the fused operation doesn’t skip the Map creation, as that’s unavoidable. It only skips the creation of the Lists representing each group, as it reduces each group to a single Friend in place.