Group and average NumPy matrix
A compact option is to use numpy_indexed (disclaimer: I am its author), which implements a fully vectorized solution:
import numpy_indexed as npi
npi.group_by(arr[:, 2]).mean(arr)
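If numpy_indexed is not available, a similar vectorized group mean can be sketched with plain NumPy. This is only an illustration of the idea and not necessarily how numpy_indexed works internally; the helper name group_mean is made up for this example, and the array is the sample data from the question.

```python
import numpy as np

arr = np.array([[6., 12., 1.], [7., 9., 1.], [8., 7., 1.],
                [4., 3., 2.], [6., 1., 2.], [2., 5., 2.],
                [9., 4., 3.],
                [2., 1., 4.], [8., 4., 4.], [3., 5., 4.]])

def group_mean(values, keys):
    """Hypothetical helper: per-group column means without a Python loop."""
    # inverse[i] is the index of keys[i] within unique_keys
    unique_keys, inverse = np.unique(keys, return_inverse=True)
    inverse = inverse.ravel()
    # Accumulate each row into the slot of its group
    sums = np.zeros((unique_keys.size, values.shape[1]))
    np.add.at(sums, inverse, values)
    # Number of rows per group, used to turn sums into means
    counts = np.bincount(inverse, minlength=unique_keys.size)
    return unique_keys, sums / counts[:, None]

keys, means = group_mean(arr[:, :2], arr[:, 2])
```

Here keys holds the distinct values of the third column and means holds the averaged first two columns, one row per key.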
Alternatively, you can loop over the unique values of the third column:
results = []
for x in sorted(np.unique(arr[...,2])):
    results.append([np.average(arr[np.where(arr[...,2]==x)][...,0]),
                    np.average(arr[np.where(arr[...,2]==x)][...,1]),
                    x])
Testing:
>>> arr
array([[ 6., 12.,  1.],
       [ 7.,  9.,  1.],
       [ 8.,  7.,  1.],
       [ 4.,  3.,  2.],
       [ 6.,  1.,  2.],
       [ 2.,  5.,  2.],
       [ 9.,  4.,  3.],
       [ 2.,  1.,  4.],
       [ 8.,  4.,  4.],
       [ 3.,  5.,  4.]])
>>> results=[]
>>> for x in sorted(np.unique(arr[...,2])):
...     results.append([np.average(arr[np.where(arr[...,2]==x)][...,0]),
...                     np.average(arr[np.where(arr[...,2]==x)][...,1]),
...                     x])
...
>>> results
[[7.0, 9.3333333333333339, 1.0], [4.0, 3.0, 2.0], [9.0, 4.0, 3.0], [4.333333333333333, 3.3333333333333335, 4.0]]
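The array arr does not need to be sorted. Note that arr[np.where(...)] uses advanced (integer-array) indexing, so each selection is a new copy of the matching rows rather than a view of arr; only the subsequent [...,0] and [...,1] slices are views (of that copy). The average is then calculated directly from those slices.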
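Whether a given selection is a view or a copy can be checked with np.shares_memory. The sketch below uses a small, deliberately unsorted array made up for illustration. It suggests that a boolean-mask selection (equivalent to the np.where form) produces a copy, while a plain column slice is a view. It also shows a shorter mask-based way to write the per-group mean.

```python
import numpy as np

# Small unsorted sample, invented for this illustration
arr = np.array([[6., 12., 1.],
                [4.,  3., 2.],
                [7.,  9., 1.],
                [6.,  1., 2.]])

mask = arr[:, 2] == 1.0
selected = arr[mask]   # boolean (advanced) indexing
column = arr[:, 0]     # basic slicing

# Advanced indexing allocates new data; basic slicing does not
selection_is_copy = not np.shares_memory(arr, selected)
column_is_view = np.shares_memory(arr, column)

# Mean of the first two columns for the rows whose key is 1.0
group_mean = arr[mask, :2].mean(axis=0)
```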
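A solution using itertools.groupby from the standard library (it relies on the rows already being ordered by the third column):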
from itertools import groupby
from operator import itemgetter
arr = [[6.0, 12.0, 1.0],
       [7.0, 9.0, 1.0],
       [8.0, 7.0, 1.0],
       [4.0, 3.0, 2.0],
       [6.0, 1.0, 2.0],
       [2.0, 5.0, 2.0],
       [9.0, 4.0, 3.0],
       [2.0, 1.0, 4.0],
       [8.0, 4.0, 4.0],
       [3.0, 5.0, 4.0]]
result = []
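# groupby yields one (key, rows) pair per run of consecutive rows with the same key
for groupByID, rows in groupby(arr, key=itemgetter(2)):
    position1, position2, counter = 0, 0, 0
    for row in rows:
        position1+=row[0]
        position2+=row[1]
        counter+=1
    # Store the two column averages together with the group id
    result.append([position1/counter, position2/counter, groupByID])
print(result)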
This would output:
[[7.0, 9.333333333333334, 1.0], [4.0, 3.0, 2.0], [9.0, 4.0, 3.0], [4.333333333333333, 3.3333333333333335, 4.0]]
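One caveat: itertools.groupby only merges consecutive rows that share a key, so the loop above depends on arr already being ordered by the third column. A minimal sketch for unsorted input is to sort by the key first. The function name group_average and the sample rows below are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

def group_average(rows, key_index=2):
    """Hypothetical helper: average columns 0 and 1 per value of the key column."""
    key = itemgetter(key_index)
    result = []
    # Sorting first makes rows with equal keys adjacent, as groupby requires
    for group_id, group in groupby(sorted(rows, key=key), key=key):
        group = list(group)
        n = len(group)
        result.append([sum(r[0] for r in group) / n,
                       sum(r[1] for r in group) / n,
                       group_id])
    return result

# Rows deliberately out of order
unsorted_rows = [[2.0, 1.0, 4.0],
                 [6.0, 12.0, 1.0],
                 [8.0, 4.0, 4.0],
                 [7.0, 9.0, 1.0]]
averages = group_average(unsorted_rows)
```

The sort costs O(n log n), which is usually acceptable for lists of this kind; for large numeric data the NumPy-based approaches are likely to be faster.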