Why is `multiprocessing.Queue.get` so slow?
For future readers, you could also try using:

```python
q = multiprocessing.Manager().Queue()
```

instead of just:

```python
q = multiprocessing.Queue()
```
I haven't yet fully worked out the mechanism behind this behavior, but one source I've read claims:

> "when pushing large items onto the queue, the items are essentially buffered, despite the immediate return of the queue’s put function."
The author goes on to explain more about it and a way to fix it, but for me, adding the Manager did the trick, clean and simple.
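If you want to compare the two on your own workload, here is a rough benchmark sketch (the timings also include producer startup and pickling, and the ~100 MB payload size is an arbitrary choice for illustration):

```python
import multiprocessing as mp
import time
import numpy as np

def producer(q):
    # push one large (~100 MB) array through the queue
    q.put(np.zeros(100 * 1024 * 1024, dtype=np.uint8))

def time_get(q, label):
    p = mp.Process(target=producer, args=(q,))
    p.start()
    t0 = time.time()
    q.get()  # rough: also includes process startup and serialization
    print("%s get() took %.2fs" % (label, time.time() - t0))
    p.join()

if __name__ == "__main__":
    time_get(mp.Queue(), "mp.Queue")
    time_get(mp.Manager().Queue(), "Manager().Queue")
```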
UPDATE: I believe this StackOverflow answer is helpful in explaining the issue.
FMQ, mentioned in the accepted answer, is also Python 2 only, which is one of the reasons I felt this answer might help more people someday.
I ran into this problem too. I was sending large numpy arrays (~300 MB), and `mp.Queue.get()` was very slow.
After digging into the Python 2.7 source code of `mp.Queue`, I found that the slowest part (on Unix-like systems) is `_conn_recvall()` in `socket_connection.c`, but I didn't look any deeper.
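The asymmetry is easy to reproduce: `put` returns almost immediately because a feeder thread serializes and writes in the background, while `get` bears the full cost of reading the payload back off the pipe. A minimal sketch (the ~300 MB size mirrors my arrays; absolute timings will vary, and the `_conn_recvall()` attribution applies to Python 2.7):

```python
import multiprocessing as mp
import time
import numpy as np

if __name__ == "__main__":
    q = mp.Queue()
    data = np.zeros(300 * 1024 * 1024, dtype=np.uint8)  # ~300 MB

    t0 = time.time()
    q.put(data)  # returns almost at once; a feeder thread does the pipe writes
    print("put: %.3fs" % (time.time() - t0))

    t0 = time.time()
    q.get()      # blocks until the whole payload is read back and unpickled
    print("get: %.3fs" % (time.time() - t0))
```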
To work around the problem, I built an experimental package, FMQ. From its README:
> This project is inspired by the use of multiprocessing.Queue (mp.Queue). mp.Queue is slow for large data items because of the speed limitation of pipes (on Unix-like systems).
>
> With mp.Queue handling the inter-process transfer, FMQ implements a stealer thread, which steals an item from mp.Queue as soon as any item is available and puts it into a Queue.Queue. The consumer process can then fetch the data from the Queue.Queue immediately [see the sketch after this excerpt].
>
> The speed-up is based on the assumption that both the producer and consumer processes are compute-intensive (thus multiprocessing is necessary) and the data is large (e.g. >50 227x227 images). Otherwise, mp.Queue with multiprocessing or Queue.Queue with threading is good enough.
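This is not FMQ's actual code, but a minimal Python 3 sketch of that stealer-thread idea (`queue.Queue` is Python 3's name for Python 2's `Queue.Queue`; the class and method names here are made up for illustration):

```python
import multiprocessing as mp
import queue
import threading

class StealerQueue:
    """Illustrative sketch: a background thread drains the pipe-backed
    mp.Queue into a process-local queue.Queue, so the consumer overlaps
    the slow pipe read with its own computation."""

    def __init__(self, maxsize=0):
        self._mpq = mp.Queue(maxsize)        # handles the inter-process hop
        self._localq = queue.Queue(maxsize)  # consumer-local staging buffer

    def put(self, item):
        # producer side: an ordinary mp.Queue put
        self._mpq.put(item)

    def start_stealer(self):
        # consumer side: start draining mp.Queue in the background
        threading.Thread(target=self._steal_loop, daemon=True).start()

    def _steal_loop(self):
        while True:
            self._localq.put(self._mpq.get())

    def get(self, block=True, timeout=None):
        # served from the local buffer, so this returns as soon as the
        # stealer has prefetched an item
        return self._localq.get(block, timeout)
```

The consumer calls `start_stealer()` once; by the time it asks for the next item, the stealer has usually already paid the pipe-transfer cost in the background.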
`fmq.Queue` is used just like an `mp.Queue`.
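For example (a sketch assuming only that mp.Queue-like interface; in particular, it assumes an `fmq.Queue` can be handed to a child process the way an `mp.Queue` can):

```python
import multiprocessing as mp
import numpy as np
import fmq  # Python 2 only, as noted in the other answer

def producer(q):
    for _ in range(5):
        q.put(np.zeros((50, 227, 227, 3), dtype=np.float32))

if __name__ == "__main__":
    q = fmq.Queue()  # dropped in where mp.Queue() would have been
    p = mp.Process(target=producer, args=(q,))
    p.start()
    for _ in range(5):
        batch = q.get()  # same call as mp.Queue.get()
    p.join()
```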
Note that there are still some Known Issues, as this project is in its early stages.