Your environment
CPython versions tested on: 3.9 and 3.10
Operating system and architecture: Arch Linux (kernel 5.19.9-arch1-1), x86_64
jhoekx commented Sep 20, 2022
Bug report
We are using multiprocessing with the spawn start method. On my 32-thread PC, starting all worker processes for my project used to take 2 seconds. At a certain point, it jumped straight to taking 20 seconds.
The slowdown appears as soon as more than 64 KB needs to be sent to a child process over the pipe.
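To check whether a given object is likely to cross that threshold, one option is to compare its pickled size with the default Linux pipe capacity. The following is only a rough sketch: the helper name `exceeds_pipe_capacity` is made up for illustration, and plain `pickle.dumps` is used as an approximation, since the spawn start method also writes pickled preparation data ahead of the process object.

```python
import pickle

# Default capacity of a Linux pipe: 16 pages of 4096 bytes (see pipe(7)).
DEFAULT_PIPE_CAPACITY = 65536


def exceeds_pipe_capacity(obj, capacity=DEFAULT_PIPE_CAPACITY):
    """Return (pickled_size, exceeds_capacity) for obj.

    Hypothetical helper: this underestimates the real payload, because
    the spawn start method sends preparation data before the object.
    """
    size = len(pickle.dumps(obj))
    return size, size > capacity


# A small payload fits in the pipe buffer, a 70,000-byte one does not.
small = exceeds_pipe_capacity(b"x" * 1_000)
large = exceeds_pipe_capacity(b"x" * 70_000)
print(small, large)
```

Anything that reports `True` here would be expected to hit the slow path described above.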
Consider this minimal reproduction case:
I added some "instrumentation" in multiprocessing/popen_spawn_posix.py to print the buffer size:
Running the example results in:
Changing the pipe size with fcntl in multiprocessing/popen_spawn_posix.py restores performance:
Where 1031 is fcntl.F_SETPIPE_SZ, which is not in Python 3.9.
Rerunning the reproduction case after this change:
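As a standalone sketch of the fcntl call described above (not the actual patch to popen_spawn_posix.py), the snippet below grows a freshly created pipe on Linux. The helper name `grow_pipe` and the 256 KiB request are assumptions chosen for illustration. The `getattr` fallbacks cover Python versions before 3.10, where the constants are missing from the fcntl module.

```python
import fcntl
import os

# fcntl.F_SETPIPE_SZ and fcntl.F_GETPIPE_SZ exist from Python 3.10 on;
# fall back to the raw Linux values (1031 and 1032) on older versions.
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)


def grow_pipe(fd, requested):
    """Try to grow the pipe behind fd and return the resulting capacity."""
    try:
        fcntl.fcntl(fd, F_SETPIPE_SZ, requested)
    except OSError:
        # The kernel refused the request (for example, it is above the
        # unprivileged limit); keep the current capacity.
        pass
    return fcntl.fcntl(fd, F_GETPIPE_SZ)


read_fd, write_fd = os.pipe()
before = fcntl.fcntl(write_fd, F_GETPIPE_SZ)
after = grow_pipe(write_fd, 256 * 1024)  # 256 KiB is an arbitrary example size
print(before, after)
os.close(read_fd)
os.close(write_fd)
```

The kernel may round the request up, so the reported capacity can be larger than the value asked for.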
Of course, changing the pipe size will only delay the onset of the problem. The real solution (if there is any) will probably be different. Blindly setting a pipe size might also not be safe as it depends on limits set in /proc.
The example above is a best case example, since it has very limited pickle overhead. We hit this limit without any data caches involved. It's just our Python objects that live after application initialization. They are slower to pickle. However, then things are still 10x slower, so not just a fixed 80ms as seen in the example.
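On the /proc limits mentioned above: on Linux, the largest pipe size an unprivileged process may request is exposed in /proc/sys/fs/pipe-max-size. A hedged sketch of clamping a request to that value could look like this. The helper name `clamp_pipe_request` is hypothetical, and per-user page limits can still cause a request to be rejected.

```python
from pathlib import Path

# Maximum pipe size an unprivileged process may set on Linux.
PIPE_MAX_SIZE = Path("/proc/sys/fs/pipe-max-size")


def clamp_pipe_request(requested, limit_file=PIPE_MAX_SIZE):
    """Clamp requested (bytes) to the system limit, if it can be read."""
    try:
        limit = int(limit_file.read_text().strip())
    except (OSError, ValueError):
        # Limit unknown (non-Linux system or unreadable file): return the
        # request unchanged; the caller still has to handle a refusal.
        return requested
    return min(requested, limit)


print(clamp_pipe_request(4 * 1024 * 1024))
```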
We use spawn instead of fork on Linux to avoid troubles with objects that cannot be pickled on other OSs (Windows).