Calculating the mode in a multimodal list in Python
In Python >= 2.7, use collections.Counter for frequency tables.
from collections import Counter
from itertools import takewhile
data = [1,1,2,3,4,4]
freq = Counter(data)
mostfreq = freq.most_common()
modes = list(takewhile(lambda x_f: x_f[1] == mostfreq[0][1], mostfreq))
Note the use of an anonymous function (lambda) that checks whether a pair (_, f) has the same frequency as the most frequent element.
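One detail worth noting: modes in the snippet above holds (value, count) pairs rather than bare values. A minimal sketch that wraps the same idea in a function and unpacks the pairs might look like the following. The function name modes_with_counter and the empty-input handling are my own assumptions, not part of the original answer.

```python
from collections import Counter
from itertools import takewhile


def modes_with_counter(data):
    """Return every value tied for the highest frequency in data."""
    # (value, count) pairs, highest count first
    ranked = Counter(data).most_common()
    if not ranked:
        return []  # assumption: empty input has no mode
    top_count = ranked[0][1]
    # The list is sorted by count, so the tied leaders form a prefix
    tied = takewhile(lambda pair: pair[1] == top_count, ranked)
    return [value for value, _ in tied]


print(modes_with_counter([1, 1, 2, 3, 4, 4]))
```

For the sample list this should yield the two tied values, 1 and 4.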
Well, the first problem is that yes, you're returning the value in frequences rather than the key. That means you get the count of the mode, not the mode itself. Normally, to get the mode, you'd use the key keyword argument to max, like so:
>>> max(frequencies, key=frequencies.get)
But in Python 2.4, max doesn't accept a key argument (it was added in 2.5). Here's an approach that I believe will work in 2.4:
>>> import random
>>> l = [random.randrange(0, 5) for _ in range(50)]
>>> frequencies = {}
>>> for i in l:
...     frequencies[i] = frequencies.get(i, 0) + 1
...
>>> frequencies
{0: 11, 1: 13, 2: 8, 3: 8, 4: 10}
>>> mode = max((v, k) for k, v in frequencies.iteritems())[1]
>>> mode
1
>>> max_freq = max(frequencies.itervalues())
>>> modes = [k for k, v in frequencies.iteritems() if v == max_freq]
>>> modes
[1]
I prefer the decorate-sort-undecorate idiom to the cmp keyword. I think it's more readable. Could be that's just me.
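The session above relies on iteritems and itervalues, which exist only in Python 2. For readers on Python 3, a rough port of the same dictionary-based approach could look like this. The function name modes_from_dict and the empty-input check are assumptions added for illustration.

```python
def modes_from_dict(values):
    """Return all values that share the highest frequency."""
    frequencies = {}
    for v in values:
        # Count occurrences without relying on collections.Counter
        frequencies[v] = frequencies.get(v, 0) + 1
    if not frequencies:
        return []  # assumption: empty input has no mode
    max_freq = max(frequencies.values())
    # Keep every key whose count ties the maximum
    return [k for k, count in frequencies.items() if count == max_freq]


print(modes_from_dict([0, 1, 1, 2, 2, 3]))  # [1, 2]
```

On Python 3.7 and later, dictionaries preserve insertion order, so the modes come back in the order they first appear in the input.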
Note that starting in Python 3.8, the standard library includes the statistics.multimode function to return a list of the most frequently occurring values in the order they were first encountered:
from statistics import multimode
multimode([1, 1, 2, 3, 4, 4])
# [1, 4]
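Two behaviours of multimode are easy to miss: the result follows first-encounter order rather than sorted order, and an empty input gives an empty list instead of raising. A short sketch, with sample data chosen here purely for illustration:

```python
from statistics import multimode

# 4 appears before 1 in the input, so it comes first in the result
in_order = multimode([4, 4, 1, 1, 2])
print(in_order)  # [4, 1]

# Empty input returns an empty list rather than raising an error
empty = multimode([])
print(empty)  # []
```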