Bash script multithreading in curl commands
You are experiencing the problem of appending to a file in parallel. The easy answer is: Don't.
Instead, let GNU Parallel run the jobs and collect their output, so that only one process writes to the file:
doit() {
  url="$1"
  uri="$2"
  # Report only when curl exits successfully; a timeout or connection error prints nothing.
  urlstatus=$(curl -o /dev/null --insecure --silent --head --write-out '%{http_code}' "${url}${uri}" --max-time 5) &&
    echo "$url $urlstatus $uri"
}
export -f doit
parallel -j200 doit :::: url uri >> urlstatus.txt
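Because two input files follow `::::`, GNU Parallel runs doit once for every combination of a line from url and a line from uri. As a rough illustration of which argument pairs that produces, here is a sequential sketch in plain Bash. The function name `combinations` is made up for this example and is not part of GNU Parallel:

```shell
# Print every pairing of a line from file $1 with a line from file $2,
# one pair per line, with the first file varying slowest.
# "combinations" is a hypothetical helper for illustration only.
combinations() {
  while IFS= read -r first; do
    while IFS= read -r second; do
      printf '%s %s\n' "$first" "$second"
    done < "$2"
  done < "$1"
}
```

Running `combinations url uri` lists the pairs that doit receives as `$1` and `$2`.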
GNU Parallel serializes the output by default, so output from one job is never mixed with output from another, and only a single stream is appended to urlstatus.txt.
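To make the idea concrete without GNU Parallel, here is a minimal plain-Bash sketch of the same principle: every job writes to its own private file, and the shared file is appended to only after all jobs have finished. This is an illustration, not a description of how GNU Parallel is implemented. It puts no limit on the number of simultaneous jobs, and `job` and `run_grouped` are hypothetical names, with `job` standing in for the curl call:

```shell
# Hypothetical stand-in for the real work (e.g. the curl call).
job() {
  echo "$1 start"
  echo "$1 end"
}

# Usage: run_grouped OUTFILE ARG...
# Runs one background job per ARG and appends each job's output to OUTFILE
# as one unbroken chunk.
run_grouped() {
  outfile="$1"
  shift
  tmpdir=$(mktemp -d) || return 1
  count=0
  for arg in "$@"; do
    # Each job gets a private output file, so no two jobs share a file.
    job "$arg" > "$tmpdir/$count" &
    count=$((count + 1))
  done
  wait  # block until every background job has exited
  n=0
  while [ "$n" -lt "$count" ]; do
    # A single process appends the buffered outputs, one whole job at a time.
    cat "$tmpdir/$n" >> "$outfile"
    n=$((n + 1))
  done
  rm -rf "$tmpdir"
}
```

For example, `run_grouped urlstatus.txt a b c` appends the two lines from each job together. The sketch emits the results in argument order, which resembles GNU Parallel's --keep-order option. By default GNU Parallel prints each job's output as soon as that job finishes.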
GNU Parallel makes it easy to get the input included in the output using --tag. So unless the output format is fixed, I would do:
parallel --tag -j200 curl -o /dev/null --insecure --silent --head --write-out '%{http_code}' {1}{2} --max-time 5 :::: url uri >> urlstatus.txt
It gives the same information, just in a different order. Instead of:
url urlstatus uri
you get:
url uri urlstatus