a faster way to download multiple files
Execute the downloads concurrently instead of sequentially, and set a sensible MaxDegreeOfParallelism otherwise you will try to make too many simultaneous request which will look like a DOS attack:
public static void Main(string[] args)
{
var urls = new List<string>();
Parallel.ForEach(
urls,
new ParallelOptions{MaxDegreeOfParallelism = 10},
DownloadFile);
}
public static void DownloadFile(string url)
{
using(var sr = new StreamReader(HttpWebRequest.Create(url)
.GetResponse().GetResponseStream()))
using(var sw = new StreamWriter(url.Substring(url.LastIndexOf('/'))))
{
sw.Write(sr.ReadToEnd());
}
}
Download files in several threads. Number of threads depends on your throughput. Also, look at WebClient
and HttpWebRequest
classes. Simple sample:
var list = new[]
{
"http://google.com",
"http://yahoo.com",
"http://stackoverflow.com"
};
var tasks = Parallel.ForEach(list,
s =>
{
using (var client = new WebClient())
{
Console.WriteLine($"starting to download {s}");
string result = client.DownloadString((string)s);
Console.WriteLine($"finished downloading {s}");
}
});
I'd use several threads in parallel, with a WebClient
. I recommend setting the max degree of parallelism to the number of threads you want, since unspecified degree of parallelism doesn't work well for long running tasks. I've used 50 parallel downloads in one of my projects without a problem, but depending on the speed of an individual download a much lower might be sufficient.
If you download multiple files in parallel from the same server, you're by default limited to a small number (2 or 4) of parallel downloads. While the http standard specifies such a low limit, many servers don't enforce it. Use ServicePointManager.DefaultConnectionLimit = 10000;
to increase the limit.