How to split dictionary into multiple dictionaries fast
Another method is zipping iterators (Python 2 shown here):
>>> from itertools import izip_longest, ifilter
>>> d = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7, 'h':8}
Create a list containing copies of the dict's item iterator (the number of copies is the number of elements per result dict). Because every entry in the chunks list refers to the same iterator, passing them all to izip_longest pulls that many consecutive items from the source dict for each output tuple (ifilter is used to remove the None fill values from the zip results). Using a generator expression keeps memory usage low:
>>> chunks = [d.iteritems()]*3
>>> g = (dict(ifilter(None, v)) for v in izip_longest(*chunks))
>>> list(g)
[{'a': 1, 'c': 3, 'b': 2},
{'e': 5, 'd': 4, 'g': 7},
{'h': 8, 'f': 6}]
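On Python 3 the same recipe still works, but `izip_longest` is `itertools.zip_longest`, `ifilter` is the built-in `filter`, and `iteritems()` becomes `items()`. A sketch of the equivalent code (variable names carried over from the example above):

```python
from itertools import zip_longest

d = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6, 'g': 7, 'h': 8}

# Three references to the *same* items() iterator, so zip_longest
# consumes three consecutive (key, value) pairs per output tuple.
chunks = [iter(d.items())] * 3

# filter(None, ...) drops the None fill values from the final, short group.
g = (dict(filter(None, group)) for group in zip_longest(*chunks))
print(list(g))
```

Since Python 3.7 dicts preserve insertion order, so the chunks come out in key order rather than the arbitrary order shown above.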
For Python 3+: xrange() was renamed to range(), so use range() there.
You can use:
from itertools import islice
def chunks(data, SIZE=10000):
    it = iter(data)
    for i in range(0, len(data), SIZE):
        yield {k: data[k] for k in islice(it, SIZE)}
Sample:
for item in chunks({i: i for i in range(10)}, 3):
    print(item)
With the following output:
{0: 0, 1: 1, 2: 2}
{3: 3, 4: 4, 5: 5}
{6: 6, 7: 7, 8: 8}
{9: 9}
Since the dictionary is so big, it is better to keep all the items involved as iterators and generators, like this:
from itertools import islice
def chunks(data, SIZE=10000):
    it = iter(data)
    for i in range(0, len(data), SIZE):
        yield {k: data[k] for k in islice(it, SIZE)}
Sample run:
for item in chunks({i: i for i in xrange(10)}, 3):
    print(item)
Output:
{0: 0, 1: 1, 2: 2}
{3: 3, 4: 4, 5: 5}
{8: 8, 6: 6, 7: 7}
{9: 9}
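Because chunks() is a generator, only one sub-dictionary is materialized at a time, which is what keeps memory usage low for a large source dict. A quick sketch (the dictionary size here is chosen arbitrarily for illustration):

```python
from itertools import islice

def chunks(data, SIZE=10000):
    it = iter(data)
    for _ in range(0, len(data), SIZE):
        yield {k: data[k] for k in islice(it, SIZE)}

big = {i: i * i for i in range(100_000)}
gen = chunks(big, 10_000)

# Only the first 10,000-item chunk is built here; the remaining
# nine chunks do not exist until the generator is advanced again.
first = next(gen)
print(len(first))  # 10000
```

Exhausting the generator (e.g. with a for loop) yields the remaining chunks one at a time, so peak memory stays at roughly one chunk plus the source dict.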