Averaging a list of lists column-wise in Python

Pure Python:

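from __future__ import division  # true division on Python 2; no effect on Python 3
def mean(a):
    # Arithmetic mean of a non-empty sequence
    return sum(a) / len(a)
a = [[240, 240, 239],
     [250, 249, 237],
     [242, 239, 237],
     [240, 234, 233]]
# zip(*a) transposes rows into columns; list() forces the lazy map on Python 3
print(list(map(mean, zip(*a))))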

This prints:

[243.0, 240.5, 236.5]

NumPy:

import numpy
a = numpy.array([[240, 240, 239],
                 [250, 249, 237],
                 [242, 239, 237],
                 [240, 234, 233]])
# axis=0 averages down the rows, giving one mean per column
print(numpy.mean(a, axis=0))

Python 3:

from statistics import mean
a = [[240, 240, 239],
     [250, 249, 237], 
     [242, 239, 237],
     [240, 234, 233]]
print(*map(mean, zip(*a)))
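
The Python 3 snippet above prints the means separated by spaces. If a list is wanted instead, a minimal sketch (the name column_means is only illustrative) could look like this:

```python
from statistics import mean

a = [[240, 240, 239],
     [250, 249, 237],
     [242, 239, 237],
     [240, 234, 233]]

# map() is lazy in Python 3, so wrap it in list() to materialize the results.
column_means = list(map(mean, zip(*a)))
print(column_means)  # [243, 240.5, 236.5]
```

With all-integer input, statistics.mean returns an int when the column divides evenly, which is why the first value shows up as 243 rather than 243.0.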

Use zip(), like so:

averages = [sum(col) / float(len(col)) for col in zip(*data)]

zip() takes any number of iterables and yields tuples holding the i-th element of each one, stopping as soon as the shortest iterable is exhausted. Unpacking the rows with * therefore performs a transpose, as with a matrix: each resulting tuple is one column.

>>> data = [[240, 240, 239],
...         [250, 249, 237], 
...         [242, 239, 237],
...         [240, 234, 233]]

>>> [list(col) for col in zip(*data)]
[[240, 250, 242, 240],
 [240, 249, 239, 234],
 [239, 237, 237, 233]]
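
One consequence of zip() stopping at the shortest iterable is that rows of unequal length silently lose their trailing columns. The following is a sketch only, assuming Python 3 and a made-up ragged input, showing the truncation and one possible workaround with itertools.zip_longest:

```python
from itertools import zip_longest

# Hypothetical ragged input: the last row is missing its third value.
ragged = [[240, 240, 239],
          [250, 249, 237],
          [242, 239]]

# zip() stops at the shortest row, so the third column disappears.
truncated = [list(col) for col in zip(*ragged)]
print(truncated)  # [[240, 250, 242], [240, 249, 239]]

# zip_longest() keeps every column and pads the gaps with None.
padded = [list(col) for col in zip_longest(*ragged)]
print(padded)  # [[240, 250, 242], [240, 249, 239], [239, 237, None]]

# Average only the values that are actually present in each column.
averages = []
for col in padded:
    present = [value for value in col if value is not None]
    averages.append(sum(present) / len(present))
print(averages)
```

Whether missing values should be skipped, treated as zero, or rejected outright depends on the data, so this is one choice among several.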

By performing sum() on each of those slices, you effectively get the column-wise sum. Simply divide by the length of the column to get the mean.

Side point: In Python 2.x, dividing two integers with / floors the result by default, which is why float() is applied to one operand to force floating-point division.
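
To make the difference concrete, here is a small sketch using Python 3 semantics, where // reproduces what Python 2's / did for two integers. The variable names are illustrative only.

```python
# Python 3 semantics: "/" is true division, "//" is floor division.
col = (240, 249, 239, 234)  # second column of the example data

true_mean = sum(col) / len(col)      # 962 / 4
floored_mean = sum(col) // len(col)  # what Python 2's "/" gave for two ints

print(true_mean)     # 240.5
print(floored_mean)  # 240
```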


data = [[240, 240, 239],
        [250, 249, 237], 
        [242, 239, 237],
        [240, 234, 233]]
avg = [float(sum(col))/len(col) for col in zip(*data)]
# [243.0, 240.5, 236.5]

This works because zip(*data) groups the values by column. The float() call is only necessary on Python 2.x, which uses integer division unless from __future__ import division is in effect.
