How to run a given function in Bash in parallel?
An efficient solution that can also run multi-line commands in parallel:
for ...your_loop...; do
    if test "$(jobs | wc -l)" -ge 8; then
        wait -n
    fi
    {
        command1
        command2
        ...
    } &
done
wait
In your case:
for i in "${list[@]}"
do
    for j in "${other[@]}"
    do
        if test "$(jobs | wc -l)" -ge 8; then
            wait -n
        fi
        {
            your
            commands
            here
        } &
    done
done
wait
If there are already 8 bash jobs running, wait -n blocks until at least one of them completes. Whenever there are fewer than 8 jobs, the loop starts new ones asynchronously. The final wait blocks until all remaining jobs have finished.
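To make the throttling concrete, here is a minimal self-contained sketch of the same loop. The array contents, the limit of 4, the sleep standing in for real work, and the scratch directory are all assumptions made for illustration; they are not part of the question.

```shell
#!/bin/bash
# Hypothetical inputs, chosen only for illustration.
list=(a b c)
other=(1 2 3 4)
max_jobs=4            # assumed limit; the answer above uses 8
outdir=$(mktemp -d)   # scratch directory so each job can leave a marker file

for i in "${list[@]}"; do
    for j in "${other[@]}"; do
        # Throttle: once max_jobs jobs are listed, block until one exits.
        if test "$(jobs | wc -l)" -ge "$max_jobs"; then
            wait -n
        fi
        {
            sleep 0.1                          # stand-in for real work
            echo "i=$i j=$j" > "$outdir/$i-$j"
        } &
    done
done
wait    # block until every remaining job has finished

# Count the marker files: one per (i, j) pair.
files=("$outdir"/*)
finished=${#files[@]}
echo "finished jobs: $finished"
```

Each of the 3 x 4 combinations writes its own marker file, so counting the files after the final wait is a quick way to check that no iteration was skipped.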
Benefits of this approach:
- It's very easy for multi-line commands. All your variables are automatically "captured" in scope, no need to pass them around as arguments
- It's relatively fast. Compare this, for example, to parallel (quoting the official man page): "parallel is slow at starting up - around 250 ms the first time and 150 ms after that."
- It only needs bash to work (wait -n requires Bash 4.3 or newer).
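One caveat about "captured" variables is worth a small sketch: a block started with & runs in a subshell, so it sees a copy of the variables as they were when the job started, and assignments made inside the block do not reach the parent script. The variable names and the scratch file below are hypothetical, used only to show the effect.

```shell
#!/bin/bash
outfile=$(mktemp)     # scratch file used to collect output from the jobs
counter=0

for name in alpha beta; do
    {
        # The job sees the value "name" had when the job was started...
        echo "job got name=$name" >> "$outfile"
        # ...but this assignment only changes the job's private copy.
        counter=99
    } &
done
wait

echo "counter in parent: $counter"
sort "$outfile"
```

If the jobs need to hand results back, writing to files (as here) or to a pipe is one option; plain variable assignment will not work.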
Downsides:
- There is a small race: there may have been 8 jobs when we counted them, but fewer by the time we start waiting. (This happens if a job finishes in the milliseconds between the two commands.) In that case we call wait -n with fewer jobs than the limit. However, the loop resumes as soon as at least one job completes, or immediately if there are 0 jobs running (wait -n exits immediately in this case).
- If you already have some commands running asynchronously (&) within the same bash script, they count toward the limit, so you'll have fewer worker processes in the loop.
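One possible variation, sketched under assumptions rather than taken from the answer itself, is to count only running jobs with jobs -rp and to re-check the count in a while loop after every wait -n. The helper name throttle, the limit of 3, and the log file are hypothetical. This does not address the second downside: unrelated background jobs in the same script are still counted.

```shell
#!/bin/bash
max_jobs=3    # assumed limit for this sketch

# Hypothetical helper: block while max_jobs or more jobs are still running.
throttle() {
    # jobs -r lists only running jobs; -p prints one PID per line.
    while [ "$(jobs -rp | wc -l)" -ge "$max_jobs" ]; do
        wait -n
    done
}

log=$(mktemp)
for n in 1 2 3 4 5 6 7; do
    throttle
    {
        sleep 0.1              # stand-in for real work
        echo "$n" >> "$log"
    } &
done
wait

completed=$(sort -n "$log" | tr '\n' ' ')
echo "completed: $completed"
```

Because the count is re-evaluated after every wake-up, a new job is started only when fewer than max_jobs jobs are reported as running.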
Edit: Please consider Ole's answer instead.
Instead of a separate script, you can put your code in a separate bash function. You can then export it, and run it via xargs:
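#!/bin/bash
dowork() {
    sleep $((RANDOM % 10 + 1))
    echo "Processing i=$1, j=$2"
}
export -f dowork

# Emit each (i, j) pair as two NUL-terminated arguments;
# xargs passes them two at a time to dowork, running up to 4 at once.
for i in "${list[@]}"
do
    for j in "${other[@]}"
    do
        printf "%s\0%s\0" "$i" "$j"
    done
done | xargs -0 -n 2 -P 4 bash -c 'dowork "$@"' --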
sem is part of GNU Parallel and is made for this kind of situation.
for i in "${list[@]}"
do
    for j in "${other[@]}"
    do
        # some processing in here - 20-30 lines of almost pure bash
        sem -j 4 dolong task
    done
done
sem --wait    # wait for all queued jobs to finish
If you like the function better, GNU Parallel can do the dual for loop in one go:
dowork() {
    echo "Starting i=$1, j=$2"
    sleep 5
    echo "Done i=$1, j=$2"
}
export -f dowork
parallel dowork ::: "${list[@]}" ::: "${other[@]}"