Improve running speed for DeleteDuplicates
Here is a semi-imperative way that gives the same result as tup2 from the original question, but much faster:
tup2b = Module[{keep}
  , keep[t_] :=
      ( keep[t] = False
      ; keep[t[[{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}]]] = False
      ; True
      )
  ; Select[tup1, keep]
  ];
tup2b === tup2
(* True *)
Length[tup2b] === Length[tup2] === 52650
(* True *)
On my machine, the calculation of tup2 takes a couple of hours whereas the approach shown here is sub-second. This approach also makes it easy to add other equivalence criteria if desired.
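For instance, here is a minimal sketch of how a second equivalence criterion could be folded in; the extra permutation used below is invented purely for illustration and is not part of the original question:
(* sketch only: the second permutation is a made-up extra equivalence criterion *)
tup2ext = Module[{keep},
   keep[t_] := (
     keep[t] = False;
     keep[t[[{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}]]] = False;
     keep[t[[{7, 8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6}]]] = False;
     True);
   Select[tup1, keep]];
Each added line simply marks one more equivalent form of the current entry as already seen; if the criteria combine to generate further equivalent forms, those images would need to be marked as well.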
How It Works
The function keep is used as a predicate to Select to determine whether to keep each entry of the list. The first time each element is encountered, keep returns True. But as a side-effect it also adds a new definition to keep that records that the entry and its equivalent permutation are no longer to be kept. The new definition will return False if either entry is encountered later in the list. In this way keep effectively maintains a set of entries seen so far (along with their equivalents).
For the example list given in the question, it is not strictly necessary to record that we have seen each unpermuted entry since there are no duplicates. But in the general case that might not be so.
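A toy example (not part of the original problem) makes the mechanism easy to see: here pairs are considered equivalent under swapping their two elements, and the input contains both a swapped duplicate and an exact duplicate:
Module[{keep},
  keep[t_] := (keep[t] = False; keep[t[[{2, 1}]]] = False; True);
  Select[{{1, 2}, {2, 1}, {3, 4}, {1, 2}}, keep]]
(* {{1, 2}, {3, 4}} *)
The first {1, 2} is kept and simultaneously blacklists both {1, 2} and {2, 1}, so later occurrences of either form are dropped.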
This method scans each entry of the list once, so it runs in time roughly proportional to the length $n$ of the list. Technically, the time is more on the order of $n \ln(n)$ due to the hashing involved in saving and testing entries, but for small $n$ the difference is not that noticeable.
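One rough way to check that claim empirically (a sketch only; actual timings will vary by machine) is to time the same keep/Select construction on prefixes of tup1 of increasing length:
(* time the scan on prefixes of tup1; roughly doubling n should roughly double the time *)
Table[
  {n, First @ AbsoluteTiming[
      Module[{keep},
        keep[t_] := (keep[t] = False;
          keep[t[[{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}]]] = False;
          True);
        Select[Take[tup1, n], keep]]]},
  {n, {5000, 10000, 20000, 40000}}]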
By contrast, and as Leonid Shifrin points out, DeleteDuplicates must in principle compare every pair of elements, so the run time is proportional to $n^2$, a much larger number of iterations.
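For reference, the slow computation of tup2 presumably looked something like the following pairwise-test form (a guess at the question's code, which is not reproduced here); the custom comparison function is what forces DeleteDuplicates into comparing every pair:
(* assumed reconstruction of the slow original; the exact code in the question may differ *)
tup2slow = DeleteDuplicates[tup1,
   #1 === #2[[{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}]] &];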
You can also get tup2 from tup1 using:
1. Union
ClearAll[fA]
fA = Union[Sort[{#, #[[{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}]]}] & /@ #][[All, 1]] &;
tup2A = fA @ tup1; // AbsoluteTiming // First
0.213222
Length @ tup2A
52650
2. DeleteDuplicates
ClearAll[fB]
fB = DeleteDuplicates[
Sort[{#, #[[{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}]]}] & /@ #][[All, 1]] &;
tup2B = fB @ tup1; // AbsoluteTiming // First
0.257217
3. GroupOrbits + PermutationGroup
ClearAll[fC]
fC = GroupOrbits[PermutationGroup[{{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}}], #,
    Permute][[All, 1]] &;
tup2C = fC @ tup1; // AbsoluteTiming // First
0.640413
4. Memoization
ClearAll[fD]
fD = Module[{f0},
f0[x_] := (f0[x] = f0[x[[{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}]]] = Sequence[]; x);
f0 /@ #] &;
tup2D = fD @ tup1; // AbsoluteTiming // First
0.794055
5. DeleteDuplicatesBy
ClearAll[fE]
fE = DeleteDuplicatesBy[Sort[{#, #[[{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}]]}] &];
tup2E = fE @ tup1; // AbsoluteTiming // First
1.13389
6. GroupBy
ClearAll[fF]
fF = Values @ GroupBy[#,
Sort[{#, #[[{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}]]}] &,
First] &;
tup2F = fF @ tup1; // AbsoluteTiming // First
1.28655
All six results match tup2b from WReach's answer:
tup2b == tup2A == tup2B == tup2C == tup2D == tup2E == tup2F
True
In comparison, tup2b takes about a second:
tup2b = Module[{keep},
   keep[t_] := (keep[t] = False;
     keep[t[[{4, 5, 6, 1, 2, 3, 10, 11, 12, 7, 8, 9}]]] = False;
     True);
   Select[tup1, keep]]; // AbsoluteTiming // First
1.06063
A tuple is deleted if both its first pair of triples and its second pair of triples are identical when exchanged (but not when they are already the same without swapping). That means we can construct the undeleted tuples directly like so:
tup2 =
Module[{
pair = Tuples[Tuples[{{0, 1}, {0, -1, 1}, {0, -1, 1}}], 2],
unique
},
unique = DeleteDuplicates[pair, #1 === Reverse@#2 &];
Flatten /@
DeleteDuplicates@
Join[Tuples[{unique, pair}], Tuples[{pair, unique}]]];
Takes about a tenth of a second on my laptop.
UPDATE: If you want to use the existing tup1 and delete the duplicates, you can keep the ones that appear in tup2 like so:
tup3 = Intersection[tup1, tup2];
which is very fast. If for some reason you need to keep tup1 in the original order, you can do something a bit slower:
tup4 = Select[tup1, AssociationThread[tup2, ConstantArray[True, Length[tup2]]]];
This takes another 0.2 seconds on my laptop.
Given the way the problem is stated, it's much easier to construct tuples you know are unique than to delete duplicates after the fact.