python generator: unpack entire generator in parallel

No. You must call next() sequentially because any non-trivial generator's next state is determined by its current state.

def gen(num):
    j = 0
    for i in xrange(num):
        j += i          # each yielded value depends on the one before it
        yield j

There's no way to parallelize calls to the above generator without knowing its state at each point it yields a value. But if you knew that, you wouldn't need to run it.
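For what it's worth, this particular toy generator does have a closed form (the value yielded on iteration i is the triangular number i*(i+1)//2), so its values could be computed independently; but that only works because it is trivial enough that you don't need to run it at all, which is exactly the point:

def gen_value(i):
    # closed form of the running sum 0 + 1 + ... + i;
    # only available because this toy generator is trivial
    return i * (i + 1) // 2

# these calls are independent of each other, so they could be farmed out to a pool
values = [gen_value(i) for i in xrange(10)]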


Assuming it's the calls to block_parser(b) that you want to run in parallel, you could try using a multiprocessing.Pool:

import multiprocessing as mp

pool = mp.Pool()    # defaults to one worker process per CPU

raw_blocks = block_generator(fin)
parsed_blocks = pool.imap(block_parser, raw_blocks)    # parse blocks lazily in the worker processes
data = parsedBlocksToOrderedDict(parsed_blocks)

Note that:

  • If you expect that list(parsed_blocks) can fit entirely in memory, then using pool.map can be much faster than pool.imap.
  • The items in raw_blocks and the return values from block_parser must be picklable, since mp.Pool passes tasks and results between processes through an mp.Queue.
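
Here is a minimal, self-contained sketch of the same pattern. The bodies of block_generator, block_parser and parsedBlocksToOrderedDict are placeholders standing in for your own functions (the question's definitions aren't shown), and 'blocks.txt' is a hypothetical input file:

import multiprocessing as mp
from collections import OrderedDict

def block_generator(fin):
    # placeholder: yield one raw block (here, one line) at a time
    for line in fin:
        yield line.strip()

def block_parser(raw_block):
    # placeholder: CPU-bound parsing of a single block;
    # its argument and return value must be picklable
    key, _, value = raw_block.partition('=')
    return key, value

def parsedBlocksToOrderedDict(parsed_blocks):
    # imap yields results in input order, so the dict keeps that order
    return OrderedDict(parsed_blocks)

if __name__ == '__main__':
    # guard needed on platforms where worker processes re-import this module
    pool = mp.Pool()
    with open('blocks.txt') as fin:
        raw_blocks = block_generator(fin)
        parsed_blocks = pool.imap(block_parser, raw_blocks)
        data = parsedBlocksToOrderedDict(parsed_blocks)
    pool.close()
    pool.join()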