Normalizing dictionary values
Try this to modify in place:
d = {'a': 0.2, 'b': 0.3}
factor = 1.0 / sum(d.itervalues())
for k in d:
    d[k] = d[k] * factor
Result:
>>> d
{'a': 0.4, 'b': 0.6}
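For anyone on Python 3, the same in-place approach works with dict.values() in place of dict.itervalues() (see the Python 3 note further down) — a minimal sketch:

```python
d = {'a': 0.2, 'b': 0.3}
factor = 1.0 / sum(d.values())  # Python 3: values() returns a view, no itervalues() needed
for k in d:
    d[k] = d[k] * factor
print(d)  # {'a': 0.4, 'b': 0.6}
```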
Alternatively, to produce a new dictionary instead, use a dict comprehension:
d = {'a': 0.2, 'b': 0.3}
factor = 1.0 / sum(d.itervalues())
normalised_d = {k: v * factor for k, v in d.iteritems()}
Note the use of d.iteritems(), which uses less memory than d.items() and so is better for a large dictionary.
EDIT: Since there are quite a few of them, and since getting this right seems to be important, I've summarised all the ideas from the comments on this answer into the following (including borrowing something from this post):
import math
import operator

def really_safe_normalise_in_place(d):
    factor = 1.0 / math.fsum(d.itervalues())
    for k in d:
        d[k] = d[k] * factor
    key_for_max = max(d.iteritems(), key=operator.itemgetter(1))[0]
    diff = 1.0 - math.fsum(d.itervalues())
    # print "discrepancy = " + str(diff)
    d[key_for_max] += diff

d = {v: v + 1.0 / v for v in xrange(1, 1000001)}
really_safe_normalise_in_place(d)
print math.fsum(d.itervalues())
It took a couple of goes to come up with a dictionary that actually produced a non-zero error when normalising, but I hope this illustrates the point.
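The reason for math.fsum over the built-in sum is rounding: sum accumulates error term by term, while fsum tracks partial sums and returns the exactly rounded result. A quick way to see the difference, using nothing beyond the standard library:

```python
import math

values = [0.1] * 10  # 0.1 is not exactly representable in binary floating point

print(sum(values))        # 0.9999999999999999 -- error accumulates at each addition
print(math.fsum(values))  # 1.0 -- exactly rounded sum
```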
EDIT: For Python 3.0, see the following change in the Python 3.0 Wiki, Built-in Changes:

Remove dict.iteritems(), dict.iterkeys(), and dict.itervalues(). Instead: use dict.items(), dict.keys(), and dict.values() respectively.
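Applying those renames (plus range for xrange and print as a function), the "really safe" version above ports to Python 3 as follows — a sketch of the same idea, not a new algorithm:

```python
import math
import operator

def really_safe_normalise_in_place(d):
    factor = 1.0 / math.fsum(d.values())
    for k in d:
        d[k] = d[k] * factor
    # fold any residual rounding error into the largest entry
    key_for_max = max(d.items(), key=operator.itemgetter(1))[0]
    diff = 1.0 - math.fsum(d.values())
    d[key_for_max] += diff

d = {v: v + 1.0 / v for v in range(1, 1000001)}
really_safe_normalise_in_place(d)
print(math.fsum(d.values()))
```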
def normalize(d, target=1.0):
    raw = sum(d.values())
    factor = target / raw
    return {key: value * factor for key, value in d.iteritems()}
Use it like this:
>>> data = {'a': 0.2, 'b': 0.3, 'c': 1.5}
>>> normalize(data)
{'b': 0.15, 'c': 0.75, 'a': 0.1}
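The target parameter also makes it easy to normalise to something other than 1.0, e.g. percentages. A small usage sketch of the same function, written for Python 3 (d.items() rather than d.iteritems()):

```python
def normalize(d, target=1.0):
    raw = sum(d.values())
    factor = target / raw
    return {key: value * factor for key, value in d.items()}

data = {'a': 0.2, 'b': 0.3, 'c': 1.5}
print(normalize(data, target=100.0))  # {'a': 10.0, 'b': 15.0, 'c': 75.0}
```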