Quickest way to compare two generic lists for differences

More efficient would be using Enumerable.Except:

var inListButNotInList2 = list.Except(list2);
var inList2ButNotInList = list2.Except(list);

This method is implemented by using deferred execution. That means you could write for example:

var first10 = inListButNotInList2.Take(10);

It is also efficient since it internally uses a Set<T> to compare the objects. It works by first collecting all distinct values from the second sequence, and then streaming the results of the first, checking that they haven't been seen before.

If you want the results to be case insensitive, the following will work:

List<string> list1 = new List<string> { "a.dll", "b1.dll" };
List<string> list2 = new List<string> { "A.dll", "b2.dll" };

var firstNotSecond = list1.Except(list2, StringComparer.OrdinalIgnoreCase).ToList();
var secondNotFirst = list2.Except(list1, StringComparer.OrdinalIgnoreCase).ToList();

firstNotSecond would contain b1.dll

secondNotFirst would contain b2.dll

Enumerable.SequenceEqual Method

Determines whether two sequences are equal according to an equality comparer. MS.Docs

Enumerable.SequenceEqual(list1, list2);

This works for all primitive data types. If you need to use it on custom objects you need to implement IEqualityComparer

Defines methods to support the comparison of objects for equality.

IEqualityComparer Interface

Defines methods to support the comparison of objects for equality. MS.Docs for IEqualityComparer

Use Except:

var firstNotSecond = list1.Except(list2).ToList();
var secondNotFirst = list2.Except(list1).ToList();

I suspect there are approaches which would actually be marginally faster than this, but even this will be vastly faster than your O(N * M) approach.

If you want to combine these, you could create a method with the above and then a return statement:

return !firstNotSecond.Any() && !secondNotFirst.Any();

One point to note is that there is a difference in results between the original code in the question and the solution here: any duplicate elements which are only in one list will only be reported once with my code, whereas they'd be reported as many times as they occur in the original code.

For example, with lists of [1, 2, 2, 2, 3] and [1], the "elements in list1 but not list2" result in the original code would be [2, 2, 2, 3]. With my code it would just be [2, 3]. In many cases that won't be an issue, but it's worth being aware of.

Quickest way to compare two generic lists for differences

Tags:

C#

Linq

List

Related

Recent Posts