Most efficient (and pythonic) way to count False values in 2D numpy arrays?
Use count_nonzero to count the non-zero (i.e. not False) values, then subtract that count from the array size:
>>> np.size(a) - np.count_nonzero(a)
2
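For readers who want to run this end to end, here is a minimal self-contained sketch. The array `a` below is a hypothetical sample (the original `a` is not shown in this answer), chosen so that it also contains two False values.

```python
import numpy as np

# Hypothetical 2D boolean array; not the asker's original data.
a = np.array([[True, False, True],
              [True, True, False]])

# count_nonzero counts the True entries, so subtracting it from the
# total number of elements leaves the number of False entries.
n_false = a.size - np.count_nonzero(a)

# Equivalent formulation: invert the mask and count the True entries.
n_false_inverted = np.count_nonzero(~a)

print(n_false, n_false_inverted)
```

Both expressions give the same count; the subtraction form avoids allocating the inverted copy of the array.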
The clearest approach is surely to ask for exactly what is needed, but that does not mean it is the most efficient:
Using %%timeit in Jupyter with Python 2.7 on the proposed answers gives a clear winner:
seq = [[True, True, False, True, False, False, False] * 10 for _ in range(100)]
a = np.array(seq)
np.size(a) - np.count_nonzero(a)   1000000 loops, best of 3: 1.34 µs per loop   - Antti Haapala
(~a).sum()                         100000 loops, best of 3: 18.5 µs per loop    - Paul H
np.size(a) - np.sum(a)             10000 loops, best of 3: 18.8 µs per loop     - OP
len(a[a == False])                 10000 loops, best of 3: 52.4 µs per loop
len(np.where(a==False))            10000 loops, best of 3: 77 µs per loop       - Forzaa
The clear winner is Antti Haapala's np.size(a) - np.count_nonzero(a), by an order of magnitude.
len(np.where(a==False)) seems to be penalized by the nested structure of the array; the same benchmark on a 1D array gives 10000 loops, best of 3: 27 µs per loop.
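To repeat the comparison outside Jupyter, something like the following sketch could be used. It assumes Python 3 and the standard `timeit` module instead of the `%%timeit` magic; the `candidates` dictionary and its labels are illustrative names, and the loop counts are arbitrary. Absolute timings will differ by machine and NumPy version, so no expected numbers are given.

```python
import timeit
import numpy as np

# Same test data as in the benchmark above: a 100 x 70 boolean array.
seq = [[True, True, False, True, False, False, False] * 10 for _ in range(100)]
a = np.array(seq)

# Candidate expressions that each return the number of False values.
candidates = {
    "size - count_nonzero": lambda: np.size(a) - np.count_nonzero(a),
    "(~a).sum()": lambda: (~a).sum(),
    "size - sum": lambda: np.size(a) - np.sum(a),
    "len(a[a == False])": lambda: len(a[a == False]),
}

# Check that all candidates agree before comparing their speed.
counts = {name: int(fn()) for name, fn in candidates.items()}
assert len(set(counts.values())) == 1

for name, fn in candidates.items():
    # Best of 3 runs of 1000 calls each, reported per call in microseconds.
    best = min(timeit.repeat(fn, number=1000, repeat=3)) / 1000
    print(f"{name:22s} {best * 1e6:8.2f} us per loop")
```

Verifying that every candidate returns the same count first guards against timing an expression that does not actually compute the answer.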
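This would do that:
len(np.where(a == False)[0])
np.where returns a tuple of index arrays, one per dimension, so take the length of one of them; len() of the tuple itself is only the number of dimensions.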
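One caveat worth checking with this approach: called with a single argument, np.where returns a tuple holding one index array per dimension, so calling len() on the result directly gives the number of dimensions, not the number of False values. A small sketch, using a hypothetical sample array with three False values:

```python
import numpy as np

# Hypothetical 2D boolean array with three False entries.
a = np.array([[True, False, False],
              [True, True, False]])

where_result = np.where(a == False)

# Length of the tuple: one index array per dimension of `a`.
n_dims = len(where_result)

# Length of any one index array: the number of False entries.
n_false = len(where_result[0])

print(n_dims, n_false)
```

Indexing the first element of the tuple works for arrays of any dimensionality, since every index array has one entry per matching element.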
Maybe there are other ways that are faster or look better.