How to generate all the permutations of a multiset?
This is my translation of the Takaoka multiset permutations algorithm into Python (available here and at repl.it):
def msp(items):
'''Yield the permutations of `items` where items is either a list
of integers representing the actual items or a list of hashable items.
The output are the unique permutations of the items given as a list
of integers 0, ..., n-1 that represent the n unique elements in
`items`.
Examples
========
>>> for i in msp('xoxox'):
... print(i)
[1, 1, 1, 0, 0]
[0, 1, 1, 1, 0]
[1, 0, 1, 1, 0]
[1, 1, 0, 1, 0]
[0, 1, 1, 0, 1]
[1, 0, 1, 0, 1]
[0, 1, 0, 1, 1]
[0, 0, 1, 1, 1]
[1, 0, 0, 1, 1]
[1, 1, 0, 0, 1]
Reference: "An O(1) Time Algorithm for Generating Multiset Permutations", Tadao Takaoka
https://pdfs.semanticscholar.org/83b2/6f222e8648a7a0599309a40af21837a0264b.pdf
'''
def visit(head):
(rv, j) = ([], head)
for i in range(N):
(dat, j) = E[j]
rv.append(dat)
return rv
u = list(set(items))
E = list(reversed(sorted([u.index(i) for i in items])))
N = len(E)
# put E into linked-list format
(val, nxt) = (0, 1)
for i in range(N):
E[i] = [E[i], i + 1]
E[-1][nxt] = None
head = 0
afteri = N - 1
i = afteri - 1
yield visit(head)
while E[afteri][nxt] is not None or E[afteri][val] < E[head][val]:
j = E[afteri][nxt] # added to algorithm for clarity
if j is not None and E[i][val] >= E[j][val]:
beforek = afteri
else:
beforek = i
k = E[beforek][nxt]
E[beforek][nxt] = E[k][nxt]
E[k][nxt] = head
if E[k][val] < E[head][val]:
i = k
afteri = E[i][nxt]
head = k
yield visit(head)
There are O(1) (per permutation) algorithms for multiset permutation generation, for example, from Takaoka (with implementation)
Generating all the possible permutations and then discarding the repeated ones is highly inefficient. Various algorithms exist to directly generate the permutations of a multiset in lexicographical order or other kind of ordering. Takaoka's algorithm is a good example, but probably that of Aaron Williams is better
http://webhome.csc.uvic.ca/~haron/CoolMulti.pdf
moreover, it has been implemented in the R package ''multicool''.
Btw, if you just want the total number of distinct permutations, the answer is the Multinomial coefficient: e.g., if you have, say, n_a elements 'a', n_b elements 'b', n_c elements 'c', the total number of distinct permutations is (n_a+n_b+n_c)!/(n_a!n_b!n_c!)