Caching IEnumerable
You can look at Saving the State of Enumerators which describes how to create lazy list (which caches once iterated items).
Check out MemoizeAll()
in the Reactive Extensions for .NET library (Rx). As it is evaluated lazily you can safely set it up during construction and just return Modules
from ListModules()
:
Modules = Source.
Descendants("Module").
Select(m => new ModuleData(m.Element("ModuleID").Value, 1, 1)).
MemoizeAll();
There's a good explanation of MemoizeAll()
(and some of the other less obvious Rx extensions) here.
I like @tsemer's answer. But I would like to propose my solutions, which has nothing to do with FP. It's naive approach, but it generates a lot less of allocations. And it is not thread safe.
public class CachedEnumerable<T> : IEnumerable<T>, IDisposable
{
IEnumerator<T> _enumerator;
readonly List<T> _cache = new List<T>();
public CachedEnumerable(IEnumerable<T> enumerable)
: this(enumerable.GetEnumerator())
{
}
public CachedEnumerable(IEnumerator<T> enumerator)
{
_enumerator = enumerator;
}
public IEnumerator<T> GetEnumerator()
{
// The index of the current item in the cache.
int index = 0;
// Enumerate the _cache first
for (; index < _cache.Count; index++)
{
yield return _cache[index];
}
// Continue enumeration of the original _enumerator,
// until it is finished.
// This adds items to the cache and increment
for (; _enumerator != null && _enumerator.MoveNext(); index++)
{
var current = _enumerator.Current;
_cache.Add(current);
yield return current;
}
if (_enumerator != null)
{
_enumerator.Dispose();
_enumerator = null;
}
// Some other users of the same instance of CachedEnumerable
// can add more items to the cache,
// so we need to enumerate them as well
for (; index < _cache.Count; index++)
{
yield return _cache[index];
}
}
public void Dispose()
{
if (_enumerator != null)
{
_enumerator.Dispose();
_enumerator = null;
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
This is how the matrix test from @tsemer's answer will work:
var ints = new [] { 1, 2, 3, 4, 5 };
var cachedEnumerable = new CachedEnumerable<int>(ints);
foreach (var x in cachedEnumerable)
{
foreach (var y in cachedEnumerable)
{
//Do something
}
}
- The outer loop (
x
) skips firstfor
, because_cache
is empty; x
fetches one item from the_enumerator
to the_cache
;x
pauses before secondfor
loop;- The inner loop (
y
) enumerates one element from the_cache
; y
fetches all elements from the_enumerator
to the_cache
;y
skips the thirdfor
loop, because itsindex
variable equals5
;x
resumes, itsindex
equals1
. It skips the secondfor
loop because_enumerator
is finished;x
enumerates one element from the_cache
using the thirdfor
loop;x
pauses before the thirdfor
;y
enumerates 5 elements from the_cache
using firstfor
loop;y
skips the secondfor
loop, because_enumerator
is finished;y
skips the thirdfor
loop, becauseindex
ofy
equals5
;x
resumes, incrementsindex
. It fetches one element from the_cache
using the thirdfor
loop.x
pauses.- if
index
variable ofx
is less than5
then go to 10; - end.
I've seen a handful of implementations out there, some older and not taking advantage of newest .Net classes, some too elaborate for my needs. I ended up with the most concise and declarative code I could muster, which added up to a class with roughly 15 lines of (actual) code. It seems to align well with OP's needs:
Edit: Second revision, better support for empty enumerables
/// <summary>
/// A <see cref="IEnumerable{T}"/> that caches every item upon first enumeration.
/// </summary>
/// <seealso cref="http://blogs.msdn.com/b/matt/archive/2008/03/14/digging-deeper-into-lazy-and-functional-c.aspx"/>
/// <seealso cref="http://blogs.msdn.com/b/wesdyer/archive/2007/02/13/the-virtues-of-laziness.aspx"/>
public class CachedEnumerable<T> : IEnumerable<T> {
private readonly bool _hasItem; // Needed so an empty enumerable will not return null but an actual empty enumerable.
private readonly T _item;
private readonly Lazy<CachedEnumerable<T>> _nextItems;
/// <summary>
/// Initialises a new instance of <see cref="CachedEnumerable{T}"/> using <paramref name="item"/> as the current item
/// and <paramref name="nextItems"/> as a value factory for the <see cref="CachedEnumerable{T}"/> containing the next items.
/// </summary>
protected internal CachedEnumerable(T item, Func<CachedEnumerable<T>> nextItems) {
_hasItem = true;
_item = item;
_nextItems = new Lazy<CachedEnumerable<T>>(nextItems);
}
/// <summary>
/// Initialises a new instance of <see cref="CachedEnumerable{T}"/> with no current item and no next items.
/// </summary>
protected internal CachedEnumerable() {
_hasItem = false;
}
/// <summary>
/// Instantiates and returns a <see cref="CachedEnumerable{T}"/> for a given <paramref name="enumerable"/>.
/// Notice: The first item is always iterated through.
/// </summary>
public static CachedEnumerable<T> Create(IEnumerable<T> enumerable) {
return Create(enumerable.GetEnumerator());
}
/// <summary>
/// Instantiates and returns a <see cref="CachedEnumerable{T}"/> for a given <paramref name="enumerator"/>.
/// Notice: The first item is always iterated through.
/// </summary>
private static CachedEnumerable<T> Create(IEnumerator<T> enumerator) {
return enumerator.MoveNext() ? new CachedEnumerable<T>(enumerator.Current, () => Create(enumerator)) : new CachedEnumerable<T>();
}
/// <summary>
/// Returns an enumerator that iterates through the collection.
/// </summary>
public IEnumerator<T> GetEnumerator() {
if (_hasItem) {
yield return _item;
var nextItems = _nextItems.Value;
if (nextItems != null) {
foreach (var nextItem in nextItems) {
yield return nextItem;
}
}
}
}
/// <summary>
/// Returns an enumerator that iterates through a collection.
/// </summary>
IEnumerator IEnumerable.GetEnumerator() {
return GetEnumerator();
}
}
A useful extension method could be:
public static class IEnumerableExtensions {
/// <summary>
/// Instantiates and returns a <see cref="CachedEnumerable{T}"/> for a given <paramref name="enumerable"/>.
/// Notice: The first item is always iterated through.
/// </summary>
public static CachedEnumerable<T> ToCachedEnumerable<T>(this IEnumerable<T> enumerable) {
return CachedEnumerable<T>.Create(enumerable);
}
}
And for the unit testers amongst you: (if you don't use resharper just take out the [SuppressMessage]
attributes)
/// <summary>
/// Tests the <see cref="CachedEnumerable{T}"/> class.
/// </summary>
[TestFixture]
public class CachedEnumerableTest {
private int _count;
/// <remarks>
/// This test case is only here to emphasise the problem with <see cref="IEnumerable{T}"/> which <see cref="CachedEnumerable{T}"/> attempts to solve.
/// </remarks>
[Test]
[SuppressMessage("ReSharper", "PossibleMultipleEnumeration")]
[SuppressMessage("ReSharper", "ReturnValueOfPureMethodIsNotUsed")]
public void MultipleEnumerationAreNotCachedForOriginalIEnumerable() {
_count = 0;
var enumerable = Enumerable.Range(1, 40).Select(IncrementCount);
enumerable.Take(3).ToArray();
enumerable.Take(10).ToArray();
enumerable.Take(4).ToArray();
Assert.AreEqual(17, _count);
}
/// <remarks>
/// This test case is only here to emphasise the problem with <see cref="IList{T}"/> which <see cref="CachedEnumerable{T}"/> attempts to solve.
/// </remarks>
[Test]
[SuppressMessage("ReSharper", "PossibleMultipleEnumeration")]
[SuppressMessage("ReSharper", "ReturnValueOfPureMethodIsNotUsed")]
public void EntireListIsEnumeratedForOriginalListOrArray() {
_count = 0;
Enumerable.Range(1, 40).Select(IncrementCount).ToList();
Assert.AreEqual(40, _count);
_count = 0;
Enumerable.Range(1, 40).Select(IncrementCount).ToArray();
Assert.AreEqual(40, _count);
}
[Test]
[SuppressMessage("ReSharper", "ReturnValueOfPureMethodIsNotUsed")]
public void MultipleEnumerationsAreCached() {
_count = 0;
var cachedEnumerable = Enumerable.Range(1, 40).Select(IncrementCount).ToCachedEnumerable();
cachedEnumerable.Take(3).ToArray();
cachedEnumerable.Take(10).ToArray();
cachedEnumerable.Take(4).ToArray();
Assert.AreEqual(10, _count);
}
[Test]
public void FreshCachedEnumerableDoesNotEnumerateExceptFirstItem() {
_count = 0;
Enumerable.Range(1, 40).Select(IncrementCount).ToCachedEnumerable();
Assert.AreEqual(1, _count);
}
/// <remarks>
/// Based on Jon Skeet's test mentioned here: http://www.siepman.nl/blog/post/2013/10/09/LazyList-A-better-LINQ-result-cache-than-List.aspx
/// </remarks>
[Test]
[SuppressMessage("ReSharper", "LoopCanBeConvertedToQuery")]
public void MatrixEnumerationIteratesAsExpectedWhileStillKeepingEnumeratedValuesCached() {
_count = 0;
var cachedEnumerable = Enumerable.Range(1, 5).Select(IncrementCount).ToCachedEnumerable();
var matrixCount = 0;
foreach (var x in cachedEnumerable) {
foreach (var y in cachedEnumerable) {
matrixCount++;
}
}
Assert.AreEqual(5, _count);
Assert.AreEqual(25, matrixCount);
}
[Test]
public void OrderingCachedEnumerableWorksAsExpectedWhileStillKeepingEnumeratedValuesCached() {
_count = 0;
var cachedEnumerable = Enumerable.Range(1, 5).Select(IncrementCount).ToCachedEnumerable();
var orderedEnumerated = cachedEnumerable.OrderBy(x => x);
var orderedEnumeratedArray = orderedEnumerated.ToArray(); // Enumerated first time in ascending order.
Assert.AreEqual(5, _count);
for (int i = 0; i < orderedEnumeratedArray.Length; i++) {
Assert.AreEqual(i + 1, orderedEnumeratedArray[i]);
}
var reorderedEnumeratedArray = orderedEnumerated.OrderByDescending(x => x).ToArray(); // Enumerated second time in descending order.
Assert.AreEqual(5, _count);
for (int i = 0; i < reorderedEnumeratedArray.Length; i++) {
Assert.AreEqual(5 - i, reorderedEnumeratedArray[i]);
}
}
private int IncrementCount(int value) {
_count++;
return value;
}
}