python parallel map (multiprocessing.Pool.map) with global data
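You need the list glob_data to be shared between processes; otherwise each worker appends to its own copy and the parent never sees the changes. multiprocessing's Manager gives you just that, a list that lives in a server process and is reached through proxies: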
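import multiprocessing as multi

def init(shared):
    # Runs once per worker: bind the shared list proxy to the global name.
    global glob_data
    glob_data = shared

def func(a):
    glob_data.append(a)

if __name__ == "__main__":
    manager = multi.Manager()
    glob_data = manager.list()
    for a in range(10):  # a plain loop, because map() is lazy in Python 3
        func(a)
    print(glob_data)  # [0, 1, 2, ..., 9] Good.
    # Hand the proxy to every worker explicitly, so this also works without fork.
    with multi.Pool(processes=8, initializer=init, initargs=(glob_data,)) as p:
        p.map(func, range(80))
    print(glob_data)  # 90 items now; order of the last 80 may vary. Super Good.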
For some background:
https://docs.python.org/3/library/multiprocessing.html#managers
Have func return a tuple containing the result of the processing and the item you want to append to glob_data. Once p.map has completed, extract the results from the first elements of the returned tuples and build glob_data from the second elements.
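A minimal sketch of that idea, in case it helps. The squaring inside func is only a stand-in for whatever the real processing is, and split_pairs and run are hypothetical helper names introduced here for illustration:

```python
import multiprocessing as multi

def func(a):
    result = a * a   # placeholder for the real per-item computation (assumption)
    to_append = a    # the value that would have been appended to glob_data
    return result, to_append

def split_pairs(pairs):
    # Separate the (result, to_append) tuples into two parallel lists.
    results = [r for r, _ in pairs]
    glob_data = [g for _, g in pairs]
    return results, glob_data

def run(n=80, processes=4):
    with multi.Pool(processes=processes) as p:
        pairs = p.map(func, range(n))  # Pool.map returns results in input order
    return split_pairs(pairs)

if __name__ == "__main__":
    results, glob_data = run()
    print(results[:5])    # [0, 1, 4, 9, 16]
    print(glob_data[:5])  # [0, 1, 2, 3, 4]
```

Since the workers never touch shared state here, no Manager is needed, and glob_data comes out in the same order as the inputs.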