Shuffle product of two lists

Recursion

This is top-level code and therefore unlikely to be as efficient as a compiled solution, but the recursive algorithm should have reasonable computational complexity:

f[u : {a_, x___}, v : {b_, y___}, c___] := f[{x}, v, c, a] ~Join~ f[u, {y}, c, b]

f[{x___}, {y___}, c___] := {{c, x, y}}  (* rule for empty-set termination *)

Now:

f[{a, b}, {c, d}]
{{a, b, c, d}, {a, c, b, d}, {a, c, d, b}, {c, a, b, d}, {c, a, d, b}, {c, d, a, b}}

Proof of result

A question was raised regarding the validity of the result of this function. You can graphically demonstrate that this function yields the correct result (as I understand it) for length 2,4 lists with this:

ArrayPlot @ f[{Pink, Red}, {1, 2, 3, 4}]

enter image description here

  • Pink always precedes Red
  • The gray values are always in order
  • No shuffles are missing
  • No shuffles are duplicated

I must conclude that the referenced paper is in error, or the original question diverges from the definition of "shuffle product" therein.


Extension to multiple lists

While it is possible to extend a two-list shuffle product to multiple lists using Fold it may be of interest to do it directly. This code is slow however and should not be used where performance matters.

f2[in_, out___] :=
 Join @@ ReplaceList[in, {x___, {a_, b___}, y___} :> f2[{x, {b}, y}, out, a]]

f2[{{} ..}, out__] := {{out}}

f2[{{1, 2}, {Cyan}, {LightRed, Pink, Red}}] // Transpose // ArrayPlot

enter image description here


Alternative method

Edit: I had a function g which used Permutations on a binary list to generate all the base orderings, but I did not fill these orderings efficiently. rasher used the same start but came up with a clever and fast way to fill those orderings. I am replacing this section of my answer with a refactored version of his code; credit to him for making the Permutations approach competitive. (If you are interested in my original, slow g see the edit history.)

Here is a refactoring of rasher's shufflem function; I shall call mine shuffleW.

Edit 2017: Now significantly faster after moving or eliminating several Transpose operations.

shuffleW[s1_, s2_] := 
  Module[{p, tp, ord},
    p = Permutations @ Join[1 & /@ s1, 0 & /@ s2]\[Transpose];
    tp = BitXor[p, 1];
    ord = Accumulate[p] p + (Accumulate[tp] + Length[s1]) tp;
    Outer[Part, {Join[s1, s2]}, ord, 1][[1]]\[Transpose]
  ]

shuffleW[{a, b}, {c, d}]
{{a, b, c, d}, {a, c, b, d}, {a, c, d, b}, {c, a, b, d}, {c, a, d, b}, {c, d, a, b}}

Timings

Timings of these functions as well as Simon's shuffles and rasher's shufflem, performed in v10.1.

I am also including rasher's unnamed but very fast penultimate Ordering method. I shall call it shuffleO and format it for readability.

shuffleO[s1_, s2_] :=
  With[{all = Join[s1, s2]},
    Join[1 & /@ s1, 0 & /@ s2]
      // Permutations
      // Map[Ordering @ Ordering @ # &]
      // Flatten
      // all[[#]] &
      // Partition[#, Length[all]] &
  ]

s1 = CharacterRange["a", "k"];
s2 = Range @ 11;

f[s1, s2]        // Length // RepeatedTiming
shuffles[s1, s2] // Length // RepeatedTiming
shufflem[s1, s2] // Length // RepeatedTiming
shuffleO[s1, s2] // Length // RepeatedTiming
shuffleW[s1, s2] // Length // RepeatedTiming
{2.6523, 705432}

{2.0618, 705432}

{1.455, 705432}

{0.78, 705432}

{0.759, 705432}

rasher's methods are the fastest in this test. I find my f the most readable as the mechanism of its action is directly visible. Take your pick. :-)

Disparate list lengths

Yi Wang posted a nice clean method that has some interesting properties. Specifically it is the best performing solution so far in the case of input lists of significantly disparate length, but they must be fed in the correct order or the performance is magnitudes worse. First in the symmetric test above:

Fold[insertElem, {{s2, 0}}, s1][[All, 1]] // Length // RepeatedTiming
{2.67, 705432}

Now with disparate lists:

s1 = Range[80];
s2 = {"a", "b", "c"};

Fold[insertElem, {{s2, 0}}, s1][[All, 1]] // Length // RepeatedTiming
{6.25, 91881}

Swap the lists and try again (so that insertElem is Folded over the shorter list):

{s2, s1} = {s1, s2};

Fold[insertElem, {{s2, 0}}, s1][[All, 1]] // Length // RepeatedTiming
{0.121, 91881}

This is significantly faster than all the others (order makes little difference to these):

f[s1, s2]        // Length // RepeatedTiming
shuffles[s1, s2] // Length // RepeatedTiming
shufflem[s1, s2] // Length // RepeatedTiming
shuffleO[s1, s2] // Length // RepeatedTiming
shuffleW[s1, s2] // Length // RepeatedTiming
{0.529, 91881}

{0.5637, 91881}

{0.562, 91881}

{0.4135, 91881}

{0.4356, 91881}

Probably not too bad performance-wise, haven't tested though:

x = {a, b};
y = {c, d};

n = Length[x] + Length[y];
px = Subsets[Range[n], {Length[x]}];
py = Reverse[Subsets[Range[n], {Length[y]}]];

Normal /@ MapThread[SparseArray[{#1 -> x, #2 -> y}] &, {px, py}]

(* {{a, b, c, d}, {a, c, b, d}, {a, c, d, b}, {c, a, b, d}, {c, a, d, b}, {c, d, a, b}} *)

Update

This has better performance:

shuffles[x_, y_] := Module[{n, px, py, z, xy},
    n = Length /@ {x, y};
    {px, py} = Subsets[Range @ Tr @ n, {#}] & /@ n;
    py = Reverse @ py;
    xy = z = Join[x, y];
    (z[[#]] = xy; z) & /@ Join[px, py, 2]
  ]

shuffles[{a, b}, {c, d}]
(* {{a, b, c, d}, {a, c, b, d}, {a, c, d, b}, {c, a, b, d}, {c, a, d, b}, {c, d, a, b}} *)

Tweaking other people's code

This is a refactoring of Mr Wizard's refactoring of rasher's code. The primary changes are to Flatten the ordering matrix so that the whole output is extracted with a single Part call (and subsequently partitioned), and to reduce the number of Transpose operations. On my PC this gives about a 30% speed gain over Mr Wizard's version:

shuffleSWR[s1_, s2_] := Module[{p, tp, accf, ord, ss},
  p = Transpose @ Permutations @ Join[1 & /@ s1, 0 & /@ s2];
  tp = BitXor[p, 1];
  ord = Accumulate[p] p + (Accumulate[tp] + Length[s1]) tp;
  ss = Join[s1, s2];
  ss[[Flatten @ Transpose @ ord]] ~Partition~ Length[ss]]

Something different, impractical yet amusing, with a hint of mathematical insouciance...

s1 = {a, b, c}
s2 = {f, g, h}

sall = Join[s1, s2];
ln = Length[sall];

s = Solve[
   Max[sall] == ln && Min[sall] == 1 && Less[Sequence @@ s1] && 
    Less[Sequence @@ s2] && Unequal[Sequence @@ sall], sall, Integers];

Range@ln /. MapAt[Reverse, s, {All, All}]

And, for my actual entry into the horse race, I offer the following. Seems to beat the current record holder pretty handily in my limited tests.

shufflem[s1_, s2_] := Module[{ss, sr, s1l, s2l, p, pp, tp, pres},

  {ss, sr} = {Join[s1, s2], Range@((s1l = Length@s1) + (s2l = Length@s2))};

  p = Permutations[Join[ConstantArray[1, s1l], ConstantArray[0, s2l]]];

  pp = (Accumulate@Transpose[p] // Transpose) p;

  tp = Clip[p, {1, 0}, {1, 0}];

  pres = pp + 
    Clip[(Accumulate@Transpose[tp] // Transpose) tp + s1l, {s1l + 1, Max[sr]}, {0, 0}];

  ss[[#]] & /@ pres
  ]

As requested by Mr. Wizard, an explanation for readers.

Pondering this over dinner, I imagined it as one list "bubbling" through the other. The pattern that came to mind was obvious: a binary counter kind of thing, where the 1 and 0 represent positions in the shuffled list.

Having distinct values makes the permutation operation reasonable, i.e., just naively permuting the whole "deck" of the lists and the filtering results in mostly wasted work, since most will have elements violating the order rule, while having only representations of some element in some list means we'll never see such transpositions in the permutation.

With that in mind, I build a list of possible positions in the shuffled list (the range of total length in sr), along with making a joined version of the lists (ss).

I then build (in p) a list with 1 and 0, in quantities matching the lengths of each original list, and generate the permutations. So now I have a big list of all possible permutations of the final shuffled list, with 1's where an element from the first list goes, and a 0 for those from the second.

We want the elements to fill these 1/0 slots to be in the order of the originals, that is, element 1,2,3... for each.

So I accumulate over each permutation (I double transpose for performance), leaving me with say something like {1,0,1,1,0,0} going to {1,1,2,3,3,3}. I'm only interested in the items from the first list on the first pass, so I multiply the accumulants by their respective permutation, e.g., in the current example we'd get {1,0,2,3,0,0} for that one permutation, meaning from the first list, slots 1,3, and 4 get filled from the list elements 1,2 and 3 of the joined list.

The second pass fills the 0 spots from the second list, by doing the same thing with some hand-waving to get the values into the second half of the joined list: I clip the permutations to turn 1->0 and 0->1, do the same machination as the last, but I add an offset (the length of the first list prefix in the joined list). I clip out noise values, ending up with the list of permutations with elements looking like {1,4,2,3,5,6}, etc.

Finally, I map over that result of permutation lists indexing them into the joined list, so for example, if the joined list were {a,b,c,d,e,f}, the example permutation element would result in {a,d,b,c,e,f} being pulled.

Vectorizing the work makes it snappy - the vast majority of the time overall is in mapping over the work to get the final result.

Lastly, a surprisingly quick (I thought it would be much slower than it turns out to be) method that avoids any mathematical manipulation:

Module[{jl = Flatten[{#1, #2}]},
   Partition[
    jl[[Flatten[
       Ordering /@ 
        Ordering /@ 
         Permutations[
          Join[ConstantArray[1, Length[#1]], 
           ConstantArray[0, Length[#2]]]]]]], Length[jl]]] &[s1, s2]

And this use of ordering is even faster due to the relatively small size of input lists, turns out mapping over them is quicker, resulting in overall time that seems competitive with the refactored*refactored version of my original. Interesting:

Partition[ Join[#1, #2][[Flatten[ Ordering[Ordering[#]] & /@ 
       Permutations[Flatten[{1 & /@ #1, 0 & /@ #2}]]]]],
   Length[#1] + Length[#2]] &[s1, s2]

Finally, the above generalized to allow more lists (timings on a netbook that was near bursting into flames):

lists = {{a, b, c, d}, {e, f, g, h, i}, {j, k, l, m, n, o}}

res2 = Partition[ Flatten[##][[Flatten[ Ordering[Ordering[#]] & /@ 
          Permutations[Flatten[MapIndexed[(idx = #2[[1]]; idx & /@ #1) &, ##]]]]]],
      Length[Flatten[##]]] &[lists]; // Timing

Length[res2]

res2 // Short

(* {4.539629, Null} *)

(* 630630 *)

(* {{a,b,c,d,e,f,g,h,i,j,k,l,m,n,o},{a,b,c,d,e,f,g,h,j,i,k,l,m,n,o},<<630627>>,{j,k,l,m,n,o,e,f,g,h,i,a,b,c,d}} *)

Any of the methods become pretty useless with large lists and/or large number of lists:

enter image description here

e.g, two lists of 40 elements, were one able to generate permutations that adhere to the ordering rule at a rate of 1,000,0000 per second would far exceed the age of the universe before completing.

The following generates a random permutation that adheres to the ordering rules for such cases:

randomOrderedShuffle[lists__] := 
 Join[lists][[Ordering@Ordering@RandomSample@(Join @@ ConstantArray @@@ 
         Transpose[{Range@Length@{lists}, Length /@ {lists}}])]]

(* The following example has 3,644,153,415,887,633,116,359,073,848,179,365,185,734,400  "valid" permutations... *)

randomOrderedShuffle[Range[10], Range[11, 20], Range[21, 30], Range[31, 40], Range[41, 50], Range[51, 60]] // Timing

(*  {0., {1, 21, 41, 11, 22, 12, 31, 23, 51, 2, 3, 52, 42, 13, 14, 53, 43, 15, 32, 4, 54, 16, 33, 17, 34, 55, 5, 44, 45, 56, 6, 35, 57, 36, 7, 46, 8, 37, 58, 24, 59, 47, 25, 9, 10, 60, 26, 48, 18, 27, 19, 38, 49, 28, 39, 29, 40, 30, 20, 50}}  *)