How can I "zip sort" parallel numpy arrays?
Like @Peter Hansen's answer, this makes a copy of the arrays before it sorts them. But it is simple, does the main sort in-place, uses the second array for auxiliary sorting, and should be very fast:
a = np.array([2, 3, 1])
b = np.array([4, 6, 2])
# combine, sort and break apart
a, b = np.sort(np.array([a, b]))
Update: The code above doesn't actually work, as pointed out in a comment. Below is some better code. This should be fairly efficient—e.g., it avoids explicitly making extra copies of the arrays. It's hard to say how efficient it will be, because the documentation doesn't give any details on the numpy.lexsort
algorithm. But it should work pretty well, since this is exactly the job lexsort
was written for.
a = np.array([5, 3, 1])
b = np.array([4, 6, 7])
new_order = np.lexsort([b, a])
a = a[new_order]
b = b[new_order]
print(a, b)
# (array([1, 3, 5]), array([7, 6, 4]))
Here's an approach that creates no intermediate Python lists, though it does require a NumPy "record array" to use for the sorting. If your two input arrays are actually related (like columns in a spreadsheet) then this might open up an advantageous way of dealing with your data in general, rather than keeping two distinct arrays around all the time, in which case you'd already have a record array and your original problem would be answered merely by calling sort() on your array.
This does an in-place sort after packing both arrays into a record array:
>>> from numpy import array, rec
>>> a = array([2, 3, 1])
>>> b = array([4, 6, 7])
>>> c = rec.fromarrays([a, b])
>>> c.sort()
>>> c.f1 # fromarrays adds field names beginning with f0 automatically
array([7, 4, 6])
Edited to use rec.fromarrays() for simplicity, skip redundant dtype, use default sort key, use default field names instead of specifying (based on this example).
b[a.argsort()]
should do the trick.
Here's how it works. First you need to find a permutation that sorts a. argsort
is a method that computes this:
>>> a = numpy.array([2, 3, 1])
>>> p = a.argsort()
>>> p
[2, 0, 1]
You can easily check that this is right:
>>> a[p]
array([1, 2, 3])
Now apply the same permutation to b.
>>> b = numpy.array([4, 6, 7])
>>> b[p]
array([7, 4, 6])