Shuffle file randomly with some additional constraints

Your example data and constraints actually only allow a few solutions—you must play John B. every other song, for example. I'm going to assume your actual full playlist isn't essentially John B, with random other stuff to break it up.

This is another random approach. Unlike @frostschutz's solution, it runs quickly. It does not guarantee a result that matches your criteria, however. I also present a second approach, which works on your example data—but I suspect will produce bad results on your real data. Having your real data (obfuscated), I add approach 3—which is a uniform random, except it avoids two songs by the same artist in a row. Note that it only makes 5 "draws" into the "deck" of remaining songs, if after that it still is faced with a duplicate artist, it'll output that song anyway—this way, its guaranteed that the program will actually finish.

Approach 1

Basically, it generates a playlist by at each point, asking "which artists do I still have unplayed songs from?" Then picking a random artist, and finally a random song from that artist. (That is, each artist is weighted equally, not in proportion to the number of songs.)

Give it a try on your actual playlist, and see if it produces better results than uniformly random.

Usage: ./script-file < input.m3u > output.m3u Make sure to chmod +x it, of course. Note it doesn't handle the signature line that is at the top of some M3U files properly... but your example didn't have that.

#!/usr/bin/perl
use warnings qw(all);
use strict;

use List::Util qw(shuffle);

# split the input playlist by artist
my %by_artist;
while (defined(my $line = <>)) {
    my $artist = ($line =~ /^(.+?) - /)
        ? $1
        : 'UNKNOWN';
    push @{$by_artist{$artist}}, $line;
}

# sort each artist's songs randomly
foreach my $l (values %by_artist) {
    @$l = shuffle @$l;
}

# pick a random artist, spit out their "last" (remeber: in random order)
# song, remove from the list. If empty, remove artist. Repeat until no
# artists left.
while (%by_artist) {
    my @a_avail = keys %by_artist;
    my $a = $a_avail[int rand @a_avail];
    my $songs = $by_artist{$a};
    print pop @$songs;
    @$songs or delete $by_artist{$a};
}

Approach 2

As a second approach, instead of pick a random artist, you can use pick the artist with the most songs, who is also not the last artist we picked. The final paragraph of the program then becomes:

# pick the artist with the most songs who isn't the last artist, spit
# out their "last" (remeber: in random order) song, remove from the
# list. If empty, remove artist. Repeat until no artists left.
my $last_a;
while (%by_artist) {
    my %counts = map { $_, scalar(@{$by_artist{$_}}) } keys %by_artist;
    my @sorted = sort { $counts{$b} <=> $counts{$a} } shuffle keys %by_artist;
    my $a = (1 == @sorted)
        ? $sorted[0]
        : (defined $last_a && $last_a eq $sorted[0])
            ? $sorted[1]
            : $sorted[0];
    $last_a = $a;
    my $songs = $by_artist{$a};
    print pop @$songs;
    @$songs or delete $by_artist{$a};
}

The rest of the program stays the same. Note that this by far isn't the most efficient way to do this, but it should be fast enough for playlists of any sane size. With your example data, all generated playlists will start with a John B. song, then an Anna A. song, then a John B. song. After that, it's much less predictable (as everyone but John B. has one song left). Note that this assumes Perl 5.7 or later.

Approach 3

Usage is the same as the previous 2. Note the 0..4 part, that's where the 5 tries max comes from. You could up the number of tries, e.g., 0..9 would give 10 total. (0..4 = 0, 1, 2, 3, 4, which you'll notice is actually 5 items).

#!/usr/bin/perl
use warnings qw(all);
use strict;

# read in playlist
my @songs = <>;

# Pick one randomly. Check if its the same artist as the previous song.
# If it is, try another random one. Try again 4 times (5 total). If its
# still the same, accept it anyway.
my $last_artist;
while (@songs) {
    my ($song_idx, $artist);
    for (0..4) {
        $song_idx = int rand @songs;
        $songs[$song_idx] =~ /^(.+?) - /;
        $artist = $1;
        last unless defined $last_artist;
        last unless defined $artist; # assume unknown are all different
        last if $last_artist ne $artist;
    }

    $last_artist = $artist;
    print splice(@songs, $song_idx, 1);
}

If I had to apply that shuffling to a deck of playing card, I think I'd first shuffle the deck, then display the cards in a row before my eyes and processing from left to right, wherever there are adjacent clubs or heart... move all but one of those at random somewhere else (though not next to another one of the same type).

For example, with a hand like

               

After basic shuffling:

     <  >< >   
                   1  2       3

two groups of adjacent spades, we need to relocate 1, 2 and 3. For 1, the choices are:

               
    ↑        ↑                    ↑        ↑

We pick one at random from those 4. Then we repeat the process for 2 and 3.

Implemented in perl that would be:

shuf list | perl -e '
  @songs = map {/(.*?)-/; [$1,$_]} <>;
  for ($i = 0; $i < @songs; $i++) {
    if (($author = $songs[$i]->[0]) eq $previous) {
      my @reloc_candidates, $same;
      for($j = 0; $j < @songs; $j++) {
        # build a list of positions where we could move that song to
        if ($songs[$j]->[0] eq $author) {$same = 1} else {
          push @reloc_candidates, $j unless $same;
          $same = 0;
        }
      }
      push @reloc_candidates, $j unless $same;

      if (@reloc_candidates) {
        # now pick one of them at random:
        my $chosen = $reloc_candidates[int(rand(@reloc_candidates))];
        splice @songs, $chosen - ($chosen > $i), 0, splice @songs, $i, 1;
        $i -= $chosen > $i;
      }
    }
    $previous = $author;
  }
  print map {$_->[1]} @songs'

It will find a solution with non-adjacent artists if it exists (unless more than half the songs are from the same artist), and should be uniform AFAICT.


If you don't mind it being horribly inefficient...

while [ 1 ]
do
    R="`shuf playlist`"
    D="`echo "$R" | sed -e 's/ - .*//' | uniq -c -d`"
    if [ "$D" == "" ]
    then
        break
    #else # DEBUG ONLY:
    #    echo --- FAIL: ---
    #    echo "$D"
    #    echo -------------
    fi
done

echo "$R"

It just keeps rolling and rolling until it comes upon a result that does not have two or more Johns in a row. If there are so many Johns in your playlist that such a combination does not exist or extremely unlikely to be rolled, well, it will hang.

Example result with your input:

John B. - Song 4
Kyle C. - Song 1
Anna A. - Song 2
John B. - Song 3
Anna A. - Song 1
John B. - Song 1
U--Rock - Song 1
John B. - Song 2
I--Rock - Song 1
John B. - Song 5

If you uncomment the debug lines, it will tell you why it failed:

--- FAIL: ---
      3 John B.
-------------
--- FAIL: ---
      2 John B.
      2 John B.
-------------

That should help determine the cause in case it hangs indefinitely.