How to loop through IEnumerable in batches
It sounds like you need LINQ's Skip and Take methods. For example:
    users.Skip(1000).Take(1000)
This skips the first 1000 items and takes the next 1000. You just need to increase the number skipped with each call.
You can pass an integer variable to Skip to control how many items are skipped, and wrap the call in a method:
    public IEnumerable<User> GetBatch(int pageNumber)
    {
        // pageNumber is zero-based: page 0 is the first 1000 items
        return users.Skip(pageNumber * 1000).Take(1000);
    }
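To walk the whole sequence with this approach, one option is to keep requesting pages until an empty one comes back. The following is only a minimal sketch: the `Paging` class, the generic `GetBatch` overload with explicit `source` and `pageSize` parameters, and `ForEachBatch` are hypothetical names introduced for illustration. Keep in mind that when the source is not a list, each `Skip` call has to enumerate from the start again, so this pattern tends to suit indexed or database-backed sources better than lazy ones.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Paging
{
    // Hypothetical helper: returns the zero-based page `pageNumber` of `source`.
    public static IEnumerable<T> GetBatch<T>(IEnumerable<T> source, int pageNumber, int pageSize)
    {
        return source.Skip(pageNumber * pageSize).Take(pageSize);
    }

    // Hypothetical helper: calls `handle` once per non-empty page, in order.
    public static void ForEachBatch<T>(IEnumerable<T> source, int pageSize, Action<List<T>> handle)
    {
        for (int page = 0; ; page++)
        {
            // Materialize the page so it is enumerated only once.
            List<T> batch = GetBatch(source, page, pageSize).ToList();
            if (batch.Count == 0)
                break; // ran past the end of the source
            handle(batch);
        }
    }
}
```

For example, `Paging.ForEachBatch(Enumerable.Range(1, 2500), 1000, b => Console.WriteLine(b.Count));` would print 1000, 1000 and 500.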
The easiest way to do this is probably just to use the GroupBy method in LINQ:
    var batches = myEnumerable
        .Select((x, i) => new { x, i })           // pair each item with its index
        .GroupBy(p => p.i / 1000, p => p.x);      // key = batch number, element = original item
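As a sketch of how the grouped result might be consumed, the query can be wrapped in a helper with a configurable batch size, using the single-argument element selector `p => p.x`. The `GroupByBatching` class and `InBatches` method are hypothetical names, not part of LINQ. One caveat worth knowing: `GroupBy` reads the entire source before it yields the first group, so this does not stream.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByBatching
{
    // Hypothetical helper wrapping the GroupBy query with a configurable batch size.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int batchSize)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        return source
            .Select((x, i) => new { x, i })           // pair each item with its index
            .GroupBy(p => p.i / batchSize, p => p.x)  // integer division gives the batch number
            .Select(g => g.ToList());                 // one list per batch, in source order
    }
}
```

With this helper, `GroupByBatching.InBatches(Enumerable.Range(1, 7), 3)` yields the batches [1, 2, 3], [4, 5, 6] and [7].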
But for a more sophisticated solution, see this blog post on how to create your own extension method to do this. Duplicated here for posterity:
    public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> collection, int batchSize)
    {
        List<T> nextbatch = new List<T>(batchSize);
        foreach (T item in collection)
        {
            nextbatch.Add(item);
            if (nextbatch.Count == batchSize)
            {
                yield return nextbatch;
                // Start a fresh list for the next batch.
                // nextbatch.Clear() would avoid the allocation, but it would also
                // empty the batch just handed to the caller (see Servy's comment below).
                nextbatch = new List<T>(batchSize);
            }
        }
        // Return whatever is left over as a final, smaller batch
        if (nextbatch.Count > 0)
            yield return nextbatch;
    }
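To illustrate why reusing the list with `Clear()` is risky, here is a deliberately flawed variant as a sketch. The `BatchPitfall` class and `BatchReusingList` method are hypothetical names used only for this demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchPitfall
{
    // Deliberately flawed variant: reuses one list and clears it between batches.
    public static IEnumerable<List<T>> BatchReusingList<T>(IEnumerable<T> collection, int batchSize)
    {
        var nextbatch = new List<T>(batchSize);
        foreach (T item in collection)
        {
            nextbatch.Add(item);
            if (nextbatch.Count == batchSize)
            {
                yield return nextbatch;
                nextbatch.Clear(); // also empties the list the caller just received
            }
        }
        if (nextbatch.Count > 0)
            yield return nextbatch;
    }
}
```

A caller that processes each batch immediately inside a `foreach` would not notice anything. A caller that keeps the batches, for example `BatchPitfall.BatchReusingList(Enumerable.Range(1, 5), 2).ToList()`, ends up with three references to the same list, which by then contains only the last item, 5.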
You can use MoreLINQ's Batch operator (available from NuGet):
    foreach (IEnumerable<User> batch in users.Batch(1000))
    {
        // use batch
    }
If adding the library is not an option, you can reuse its implementation:
    public static IEnumerable<IEnumerable<T>> Batch<T>(
        this IEnumerable<T> source, int size)
    {
        T[] bucket = null;
        var count = 0;

        foreach (var item in source)
        {
            if (bucket == null)
                bucket = new T[size];

            bucket[count++] = item;

            if (count != size)
                continue;

            yield return bucket.Select(x => x);

            bucket = null;
            count = 0;
        }

        // Return the last bucket with all remaining elements
        if (bucket != null && count > 0)
        {
            Array.Resize(ref bucket, count);
            yield return bucket.Select(x => x);
        }
    }
BTW, for performance you can simply return bucket without calling Select(x => x). Select is optimized for arrays, but the selector delegate would still be invoked for each item. So, in your case it's better to use:
    yield return bucket;
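One trade-off to weigh before dropping the `Select`: returning the array directly lets a caller cast the batch back to `T[]` and modify it, whereas the `Select` wrapper hides the array. The sketch below only demonstrates that difference; the `BucketExposure` class and its methods are hypothetical names for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BucketExposure
{
    // Returns true if the caller can recover the underlying array by casting.
    public static bool IsArray<T>(IEnumerable<T> batch)
    {
        return batch is T[];
    }

    // Compares the two ways of handing out a bucket.
    public static (bool Direct, bool Wrapped) Compare()
    {
        int[] bucket = { 1, 2, 3 };
        IEnumerable<int> direct = bucket;                 // what `yield return bucket;` hands out
        IEnumerable<int> wrapped = bucket.Select(x => x); // what the Select version hands out
        return (IsArray(direct), IsArray(wrapped));
    }
}
```

Here `BucketExposure.Compare()` returns `(true, false)`. Since each bucket is a freshly allocated array that the method never touches again, exposing it is usually harmless.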