Parallel download using Curl command line utility

Well, curl is just a simple UNIX process. You can have as many of these curl processes running in parallel and sending their outputs to different files.

curl can use the filename part of the URL to generate the local file. Just use the -O option (man curl for details).

You could use something like the following

urls="http://example.com/?page1.html http://example.com?page2.html" # add more URLs here

for url in $urls; do
   # run the curl job in the background so we can start another job
   # and disable the progress bar (-s)
   echo "fetching $url"
   curl $url -O -s &
done
wait #wait for all background jobs to terminate

My answer is a bit late, but I believe all of the existing answers fall just a little short. The way I do things like this is with xargs, which is capable of running a specified number of commands in subprocesses.

The one-liner I would use is, simply:

$ seq 1 10 | xargs -n1 -P2 bash -c 'i=$0; url="http://example.com/?page${i}.html"; curl -O -s $url'

This warrants some explanation. The use of -n 1 instructs xargs to process a single input argument at a time. In this example, the numbers 1 ... 10 are each processed separately. And -P 2 tells xargs to keep 2 subprocesses running all the time, each one handling a single argument, until all of the input arguments have been processed.

You can think of this as MapReduce in the shell. Or perhaps just the Map phase. Regardless, it's an effective way to get a lot of work done while ensuring that you don't fork bomb your machine. It's possible to do something similar in a for loop in a shell, but end up doing process management, which starts to seem pretty pointless once you realize how insanely great this use of xargs is.

Update: I suspect that my example with xargs could be improved (at least on Mac OS X and BSD with the -J flag). With GNU Parallel, the command is a bit less unwieldy as well:

parallel --jobs 2 curl -O -s http://example.com/?page{}.html ::: {1..10}

Parallel download using Curl command line utility

Tags:

Linux

Unix

Shell

Curl

Related

Recent Posts