How do I group this list of dicts by the same month?
First, I would sort the data [1]:
>>> lst = [{'date':'2008-04-23','value':'1'},
... {'date':'2008-04-01','value':'8'},
... {'date':'2008-04-05','value':'3'},
... {'date':'2009-04-19','value':'5'},
... {'date':'2009-04-21','value':'8'},
... {'date':'2010-09-09','value':'3'},
... {'date':'2010-09-10','value':'4'},
... ]
>>> lst.sort(key=lambda x:x['date'][:7])
>>> lst
[{'date': '2008-04-23', 'value': '1'}, {'date': '2008-04-01', 'value': '8'}, {'date': '2008-04-05', 'value': '3'}, {'date': '2009-04-19', 'value': '5'}, {'date': '2009-04-21', 'value': '8'}, {'date': '2010-09-09', 'value': '3'}, {'date': '2010-09-10', 'value': '4'}]
Then, I would use itertools.groupby to do the grouping:
>>> from itertools import groupby
>>> for k,v in groupby(lst,key=lambda x:x['date'][:7]):
...     print(k, list(v))
...
2008-04 [{'date': '2008-04-23', 'value': '1'}, {'date': '2008-04-01', 'value': '8'}, {'date': '2008-04-05', 'value': '3'}]
2009-04 [{'date': '2009-04-19', 'value': '5'}, {'date': '2009-04-21', 'value': '8'}]
2010-09 [{'date': '2010-09-09', 'value': '3'}, {'date': '2010-09-10', 'value': '4'}]
>>>
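The sort matters because groupby only merges consecutive items that share a key; it does not collect matching items from across the whole list. A minimal sketch of that behavior, using a deliberately shuffled sample (rows and month_of are hypothetical names for illustration, not part of the original data):

```python
from itertools import groupby

# Hypothetical, deliberately unsorted sample: the two 2008-04 rows are not adjacent.
rows = [
    {'date': '2008-04-23', 'value': '1'},
    {'date': '2009-04-19', 'value': '5'},
    {'date': '2008-04-01', 'value': '8'},
]

def month_of(row):
    # 'YYYY-MM-DD' -> 'YYYY-MM'
    return row['date'][:7]

# Without sorting, groupby starts a new group every time the key changes.
unsorted_keys = [k for k, _ in groupby(rows, key=month_of)]

# After sorting by the same key, each month shows up exactly once.
sorted_keys = [k for k, _ in groupby(sorted(rows, key=month_of), key=month_of)]

print(unsorted_keys)  # ['2008-04', '2009-04', '2008-04']
print(sorted_keys)    # ['2008-04', '2009-04']
```

Python's sort is stable, so rows within the same month keep their original relative order.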
Now, to get the output you wanted:
>>> for k,v in groupby(lst,key=lambda x:x['date'][:7]):
...     print({'date':k+'-01','value':sum(int(d['value']) for d in v)})
...
{'date': '2008-04-01', 'value': 12}
{'date': '2009-04-01', 'value': 13}
{'date': '2010-09-01', 'value': 7}
[1] Your data actually already appears to be sorted in this regard, so you might be able to skip this step.
Use itertools.groupby:
data = [{'date':'2008-04-23','value':'1'},
{'date':'2008-04-01','value':'8'},
{'date':'2008-04-05','value':'3'},
{'date':'2009-04-19','value':'5'},
{'date':'2009-04-21','value':'8'},
{'date':'2010-09-09','value':'3'},
{'date':'2010-09-10','value':'4'},
]
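import itertools

def month_key(datum):
    # 'YYYY-MM-DD' -> 'YYYY-MM'
    return datum['date'].rsplit('-', 1)[0]

# groupby only groups adjacent items, so sort by the same key first.
data.sort(key=month_key)
result = [{
    'date': month + '-01',
    'value': sum(int(item['value']) for item in group)
} for month, group in itertools.groupby(data, key=month_key)]
print(result)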
# [{'date': '2008-04-01', 'value': 12},
# {'date': '2009-04-01', 'value': 13},
# {'date': '2010-09-01', 'value': 7}]
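If sorting the input is undesirable, one alternative (not what the answers above use) is to accumulate the totals in a dict keyed by month. This is a rough sketch assuming the same 'YYYY-MM-DD' date strings and numeric string values as in the question; sum_by_month is a hypothetical helper name.

```python
from collections import defaultdict

def sum_by_month(records):
    """Sum the 'value' fields per 'YYYY-MM' month; input order does not matter."""
    totals = defaultdict(int)
    for record in records:
        month = record['date'][:7]  # 'YYYY-MM-DD' -> 'YYYY-MM'
        totals[month] += int(record['value'])
    # Sort the months only for a predictable output order.
    return [{'date': month + '-01', 'value': total}
            for month, total in sorted(totals.items())]

data = [
    {'date': '2008-04-23', 'value': '1'},
    {'date': '2008-04-01', 'value': '8'},
    {'date': '2008-04-05', 'value': '3'},
    {'date': '2009-04-19', 'value': '5'},
    {'date': '2009-04-21', 'value': '8'},
    {'date': '2010-09-09', 'value': '3'},
    {'date': '2010-09-10', 'value': '4'},
]

result = sum_by_month(data)
print(result)
# [{'date': '2008-04-01', 'value': 12}, {'date': '2009-04-01', 'value': 13}, {'date': '2010-09-01', 'value': 7}]
```

This makes a single pass over the records and leaves the original list untouched, at the cost of holding one running total per month in memory.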