Most efficient way of adding elements given the index list in numpy
I doubt you can get much faster than np.bincount
- and notice how the official documentation provides this exact usecase
# Your example
A = [0.5, 0.6]
D = [[0.1, 0.1, 0.2], [0.2, 0.4, 0.1]]
I = [[0, 1, 0], [0, 1, 1]]
# Solution
import numpy as np
D, I = np.array(D).flatten(), np.array(I).flatten()
print(np.bincount(I, D)) #[0.5 0.6]
The shape of I
and D
doesn't matter: you can clearly ravel the arrays without changing the outcome:
index = np.ravel(I)
data = np.ravel(D)
Now you can sort both arrays according to I
:
sorter = np.argsort(index)
index = index[sorter]
data = data[sorter]
This is helpful because now index
looks like this:
0, 0, 0, 1, 1, 1
And data
is this:
0.1, 0.2, 0.2, 0.1, 0.4, 0.1
Adding together runs of consecutive numbers should be easier than processing random locations. Let's start by finding the indices where the runs start:
runs = np.r_[0, np.flatnonzero(np.diff(index)) + 1]
Now you can use the fact that ufuncs like np.add
have a partial reduce
operation called reduceat
. This allows you to sum regions of an array:
a = np.add.reduceat(data, runs)
If I
is guaranteed to contain all indices in [0, A.size
) at least once, you're done: just assign to A
instead of a
. If not, you can make the mapping using the fact that the start of each run in index
is the target index:
A = np.zeros(n)
A[index[runs]] = a
Algorithmic complexity analysis:
ravel
is O(1) in time and space if the data is in an array. If it's a list, this is O(MN) in time and spaceargsort
is O(MN log MN) in time andO(MN)
in space- Indexing by
sorter
is O(MN) in time and space - Computing
runs
is O(MN) in time and O(MN + M) = O(MN) in space reduceat
is a single pass: O(MN) in time, O(M) in space- Reassigning
A
is O(M) in time and space
Total: O(MN log MN) time, O(MN) space
TL;DR
def make_A(D, I, M):
index = np.ravel(I)
data = np.ravel(D)
sorter = np.argsort(index)
index = index[sorter]
if index[0] < 0 or index[-1] >= M:
raise ValueError('Bad indices')
data = data[sorter]
runs = np.r_[0, np.flatnonzero(np.diff(index)) + 1]
a = np.add.reduceat(data, runs)
if a.size == M:
return a
A = np.zeros(M)
A[index[runs]] = a
return A