How to efficiently find positions of duplicates?
You can use GatherBy
for this. You can map List
onto Range[...]
first if you wish to have exactly the same output you showed.
positionDuplicates[list_] := GatherBy[Range@Length[list], list[[#]] &]
list = {3, 3, 6, 11, 13, 13, 11, 1, 2, 3, 12, 8, 9, 9, 4, 15, 5, 6, 9, 12}
positionDuplicates[list]
(* ==> {{1, 2, 10}, {3, 18}, {4, 7}, {5, 6}, {8}, {9},
{11, 20}, {12}, {13, 14, 19}, {15}, {16}, {17}} *)
If you prefer a Sow
/Reap
solution, I think this is simpler than your version (but slower than GatherBy
):
positionDuplicates[list_] := Last@Reap[MapIndexed[Sow[#2, #1] &, list]]
If you need to remove the positions of non-duplicates, I'd suggest doing that as a post processing step, e.g. Select[result, Length[#] > 1&]
Here is a version based on sorting, and using Mr. Wizard's dynP function:
dynP[l_, p_] :=
MapThread[l[[# ;; #2]] &, {{0}~Join~Most@# + 1, #} &@Accumulate@p]
positionOfDuplicates[list_] :=
With[{ord = Ordering[list]},
SortBy[dynP[ord, Length /@ Split[list[[ord]]]], First]
]
so that
positionOfDuplicates[list]
(* {{1,2,10},{3,18},{4,7},{5,6},{8},{9},{11,20},{12},{13,14,19},{15},{16},{17}} *)
It is also fast enough, although not as fast as the one based on GatherBy
.
If you wanted to retain each value as well as its positions, this works.
Sort[
Map[{#[[1, 1]], Flatten[#[[All, 2]]]} &,
Reap[MapIndexed[Sow[{#1, #2}, #1] &, list]][[2, All, All]]]]
(* Out[178]= {{0, {14}}, {1, {17, 19}}, {4, {4,
20}}, {5, {12}}, {7, {10}}, {9, {13}}, {10, {2,
6}}, {11, {3}}, {12, {7, 15}}, {13, {8, 9, 11}}, {14, {1, 16,
18}}, {15, {5}}} *)
It's maybe 20x slower than the GatherBy
though.