Shuffling NumPy array along a given axis
Vectorized solution with rand+argsort trick
We could generate unique indices along the specified axis and index into the input array with advanced indexing. To generate the unique indices, we would use the random float generation + sort trick: sorting independent random floats along the axis yields a random permutation of positions for each 1-D slice, which gives us a vectorized solution. We would also generalize it to n-dimensional arrays and arbitrary axes with np.take_along_axis. The final implementation would look something like this -
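import numpy as np

def shuffle_along_axis(a, axis):
    # argsort of random floats gives an independent random permutation per 1-D slice
    idx = np.random.rand(*a.shape).argsort(axis=axis)
    return np.take_along_axis(a, idx, axis=axis)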
Note that this shuffle won't be in-place and returns a shuffled copy.
Sample run -
In [33]: a
Out[33]:
array([[18, 95, 45, 33],
       [40, 78, 31, 52],
       [75, 49, 42, 94]])

In [34]: shuffle_along_axis(a, axis=0)
Out[34]:
array([[75, 78, 42, 94],
       [40, 49, 45, 52],
       [18, 95, 31, 33]])

In [35]: shuffle_along_axis(a, axis=1)
Out[35]:
array([[45, 18, 33, 95],
       [31, 78, 52, 40],
       [42, 75, 94, 49]])
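
As a sanity check on the rand+argsort approach, here is a minimal sketch that draws the random floats from a seeded np.random.default_rng generator instead of np.random.rand. The rng parameter and the helper names same_columns, same_rows and input_untouched are assumptions added for illustration and are not part of the answer above. It only verifies that every slice along the chosen axis keeps its own values and that the input array is left unmodified.

```python
import numpy as np

def shuffle_along_axis(a, axis, rng=None):
    # rng is a hypothetical optional argument, added here for reproducibility
    rng = np.random.default_rng() if rng is None else rng
    # one random float per element; argsort turns them into permutations
    idx = rng.random(a.shape).argsort(axis=axis)
    return np.take_along_axis(a, idx, axis=axis)

a = np.array([[18, 95, 45, 33],
              [40, 78, 31, 52],
              [75, 49, 42, 94]])
before = a.copy()

out0 = shuffle_along_axis(a, axis=0, rng=np.random.default_rng(0))
out1 = shuffle_along_axis(a, axis=1, rng=np.random.default_rng(0))

# each column (axis=0) or row (axis=1) still holds exactly its own values
same_columns = np.array_equal(np.sort(out0, axis=0), np.sort(a, axis=0))
same_rows = np.array_equal(np.sort(out1, axis=1), np.sort(a, axis=1))
# the input is not modified, since a shuffled copy is returned
input_untouched = np.array_equal(a, before)
print(same_columns, same_rows, input_untouched)
```

Passing a seeded generator makes a run repeatable, which is convenient when testing code that depends on the shuffle.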
You have to call numpy.random.shuffle() several times because you are shuffling several sequences independently. numpy.random.shuffle() works on any mutable sequence and is not actually a ufunc. The shortest and most efficient code to shuffle all rows of a two-dimensional array a separately probably is:
list(map(numpy.random.shuffle, a))
Some people prefer to write this as a list comprehension instead:
[numpy.random.shuffle(x) for x in a]
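
Both one-liners build a throwaway list of None values, because numpy.random.shuffle() shuffles in place and returns None. A minimal sketch of the same idea as an explicit loop follows. As an assumption beyond this answer, it also shows Generator.permuted (available in NumPy 1.20 and later), which shuffles each row independently in a single call and returns a copy; the variable names are illustrative only.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
original = a.copy()

# Each x is a view of one row of a, so shuffling x shuffles that row in place.
for x in a:
    np.random.shuffle(x)

# Assumption: NumPy >= 1.20. permuted shuffles every row independently
# along axis=1 and returns a new array, leaving `original` unchanged.
rng = np.random.default_rng()
b = rng.permuted(original, axis=1)

print(a)
print(b)
```

The loop modifies a itself, while permuted leaves its input alone unless an out argument is supplied.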