Python Multiprocessing Locks
I think the reason is that the multiprocessing pool uses pickle to transfer objects between the processes. However, a Lock cannot be pickled:
>>> import multiprocessing
>>> import pickle
>>> lock = multiprocessing.Lock()
>>> lp = pickle.dumps(lock)
Traceback (most recent call last):
File "<pyshell#3>", line 1, in <module>
lp = pickle.dumps(lock)
...
RuntimeError: Lock objects should only be shared between processes through inheritance
>>>
See the "Picklability" and "Better to inherit than pickle/unpickle" sections of https://docs.python.org/2/library/multiprocessing.html#all-platforms
Other answers already explain that apply_async fails silently unless an appropriate error_callback argument is provided. I still found the OP's other point valid: the official docs do indeed show multiprocessing.Lock being passed around as a function argument. In fact, the sub-section titled "Explicitly pass resources to child processes" in the Programming guidelines recommends passing a multiprocessing.Lock object as a function argument instead of a global variable. And I have written a lot of code in which I pass a multiprocessing.Lock as an argument to the child process, and it all works as expected.
So, what gives?
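Before digging in, here is a minimal reproduction of that silent failure. This is only a sketch; the job function and its (lock, i) argument order are my assumption, chosen to mirror the pool.apply(job, [l, i]) call quoted in the traceback further down. Without the error_callback the script prints nothing at all; with it, the familiar RuntimeError surfaces:

import multiprocessing


def job(lock, i):
    with lock:
        print(f'job {i}')


if __name__ == '__main__':
    lock = multiprocessing.Lock()
    pool = multiprocessing.Pool(2)
    # apply_async returns an AsyncResult; the pickling error is stored
    # inside it and stays invisible unless you call .get() on the result
    # or supply an error_callback, as done here.
    pool.apply_async(job, (lock, 0),
                     error_callback=lambda e: print(f'error_callback got: {e!r}'))
    pool.close()
    pool.join()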
I first investigated whether multiprocessing.Lock is pickle-able or not. In Python 3, on macOS + CPython, trying to pickle multiprocessing.Lock produces the familiar RuntimeError encountered by others.
>>> pickle.dumps(multiprocessing.Lock())
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-7-66dfe1355652> in <module>
----> 1 pickle.dumps(multiprocessing.Lock())

/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/synchronize.py in __getstate__(self)
     99
    100     def __getstate__(self):
--> 101         context.assert_spawning(self)
    102         sl = self._semlock
    103         if sys.platform == 'win32':

/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/context.py in assert_spawning(obj)
    354             raise RuntimeError(
    355                 '%s objects should only be shared between processes'
--> 356                 ' through inheritance' % type(obj).__name__
    357             )

RuntimeError: Lock objects should only be shared between processes through inheritance
To me, this confirms that multiprocessing.Lock is indeed not pickle-able.
Aside begins
But the same lock still needs to be shared across two or more Python processes, which will have their own, potentially different address spaces (such as when we use "spawn" or "forkserver" as start methods). multiprocessing must be doing something special to send a Lock across processes. This other StackOverflow post seems to indicate that on Unix systems, multiprocessing.Lock may be implemented via named semaphores that are supported by the OS itself (outside Python). Two or more Python processes can then link to the same lock, which effectively resides in one location outside both processes. There may be a shared-memory implementation as well.
Aside ends
Can we pass a multiprocessing.Lock object as an argument or not?
After a few more experiments and more reading, it appears that the difference is between multiprocessing.Pool and multiprocessing.Process. multiprocessing.Process lets you pass multiprocessing.Lock as an argument, but multiprocessing.Pool doesn't. Here is an example that works:
import multiprocessing
import time
from multiprocessing import Process, Lock


def task(n: int, lock):
    # The lock serializes the print/sleep section across processes.
    with lock:
        print(f'n={n}')
        time.sleep(0.25)


if __name__ == '__main__':
    multiprocessing.set_start_method('forkserver')
    lock = Lock()
    # The lock is passed to each child process as a plain argument.
    processes = [Process(target=task, args=(i, lock)) for i in range(20)]
    for process in processes:
        process.start()
    for process in processes:
        process.join()
Note that the if __name__ == '__main__': guard is essential, as mentioned in the "Safe importing of main module" sub-section of the Programming guidelines.
Why the difference? The traceback above holds the clue: pickling a Lock calls context.assert_spawning, which succeeds only while a process is being spawned. multiprocessing.Process pickles its target and arguments as part of spawning the child, so the check passes. multiprocessing.Pool, by contrast, ships each task to its already-running workers through an internal SimpleQueue, and pickling a Lock at that point, outside of process creation, fails.
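Given that, the usual way to share a lock with Pool workers is again through inheritance: hand the lock to the pool's initializer, which runs in each worker as it starts (i.e., while spawning, when pickling a Lock is allowed), and stash it in a module-level variable. A minimal sketch; the init_worker helper and the worker_lock global are my names, not anything from the standard library:

import multiprocessing


def init_worker(lock):
    # Runs once in each worker process as it starts; the lock travels as
    # part of process creation, so the assert_spawning check passes.
    global worker_lock
    worker_lock = lock


def task(n: int):
    with worker_lock:
        print(f'n={n}')


if __name__ == '__main__':
    lock = multiprocessing.Lock()
    with multiprocessing.Pool(4, initializer=init_worker,
                              initargs=(lock,)) as pool:
        pool.map(task, range(20))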
If you change pool.apply_async to pool.apply, you get this exception:
Traceback (most recent call last):
  File "p.py", line 15, in <module>
    pool.apply(job, [l, i])
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 244, in apply
    return self.apply_async(func, args, kwds).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 558, in get
    raise self._value
RuntimeError: Lock objects should only be shared between processes through inheritance
pool.apply_async is just hiding it. I hate to say this, but using a global variable is probably the simplest way for your example. Let's just hope the velociraptors don't get you.
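In case it helps, this is what that global-variable route looks like; a minimal sketch, assuming the "fork" start method (the default on Linux), where children inherit the parent's module-level objects:

import multiprocessing

# Created once in the parent; with the "fork" start method each worker
# inherits this very lock, so it never needs to be pickled. Under "spawn"
# each worker would re-import the module and build its own, unrelated
# lock, so use a Pool initializer there instead.
lock = multiprocessing.Lock()


def job(i):
    with lock:
        print(f'job {i}')


if __name__ == '__main__':
    with multiprocessing.Pool(2) as pool:
        pool.map(job, range(10))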