Comparing two collections for equality irrespective of the order of items in them
It turns out Microsoft already has this covered in its testing framework: CollectionAssert.AreEquivalent
Remarks
Two collections are equivalent if they have the same elements in the same quantity, but in any order. Elements are equal if their values are equal, not if they refer to the same object.
Using reflector, I modified the code behind AreEquivalent() to create a corresponding equality comparer. It is more complete than existing answers, since it takes nulls into account, implements IEqualityComparer and has some efficiency and edge case checks. plus, it's Microsoft :)
public class MultiSetComparer<T> : IEqualityComparer<IEnumerable<T>>
{
private readonly IEqualityComparer<T> m_comparer;
public MultiSetComparer(IEqualityComparer<T> comparer = null)
{
m_comparer = comparer ?? EqualityComparer<T>.Default;
}
public bool Equals(IEnumerable<T> first, IEnumerable<T> second)
{
if (first == null)
return second == null;
if (second == null)
return false;
if (ReferenceEquals(first, second))
return true;
if (first is ICollection<T> firstCollection && second is ICollection<T> secondCollection)
{
if (firstCollection.Count != secondCollection.Count)
return false;
if (firstCollection.Count == 0)
return true;
}
return !HaveMismatchedElement(first, second);
}
private bool HaveMismatchedElement(IEnumerable<T> first, IEnumerable<T> second)
{
int firstNullCount;
int secondNullCount;
var firstElementCounts = GetElementCounts(first, out firstNullCount);
var secondElementCounts = GetElementCounts(second, out secondNullCount);
if (firstNullCount != secondNullCount || firstElementCounts.Count != secondElementCounts.Count)
return true;
foreach (var kvp in firstElementCounts)
{
var firstElementCount = kvp.Value;
int secondElementCount;
secondElementCounts.TryGetValue(kvp.Key, out secondElementCount);
if (firstElementCount != secondElementCount)
return true;
}
return false;
}
private Dictionary<T, int> GetElementCounts(IEnumerable<T> enumerable, out int nullCount)
{
var dictionary = new Dictionary<T, int>(m_comparer);
nullCount = 0;
foreach (T element in enumerable)
{
if (element == null)
{
nullCount++;
}
else
{
int num;
dictionary.TryGetValue(element, out num);
num++;
dictionary[element] = num;
}
}
return dictionary;
}
public int GetHashCode(IEnumerable<T> enumerable)
{
if (enumerable == null) throw new
ArgumentNullException(nameof(enumerable));
int hash = 17;
foreach (T val in enumerable)
hash ^= (val == null ? 42 : m_comparer.GetHashCode(val));
return hash;
}
}
Sample usage:
var set = new HashSet<IEnumerable<int>>(new[] {new[]{1,2,3}}, new MultiSetComparer<int>());
Console.WriteLine(set.Contains(new [] {3,2,1})); //true
Console.WriteLine(set.Contains(new [] {1, 2, 3, 3})); //false
Or if you just want to compare two collections directly:
var comp = new MultiSetComparer<string>();
Console.WriteLine(comp.Equals(new[] {"a","b","c"}, new[] {"a","c","b"})); //true
Console.WriteLine(comp.Equals(new[] {"a","b","c"}, new[] {"a","b"})); //false
Finally, you can use your an equality comparer of your choice:
var strcomp = new MultiSetComparer<string>(StringComparer.OrdinalIgnoreCase);
Console.WriteLine(strcomp.Equals(new[] {"a", "b"}, new []{"B", "A"})); //true
A simple and fairly efficient solution is to sort both collections and then compare them for equality:
bool equal = collection1.OrderBy(i => i).SequenceEqual(
collection2.OrderBy(i => i));
This algorithm is O(N*logN), while your solution above is O(N^2).
If the collections have certain properties, you may be able to implement a faster solution. For example, if both of your collections are hash sets, they cannot contain duplicates. Also, checking whether a hash set contains some element is very fast. In that case an algorithm similar to yours would likely be fastest.
Create a Dictionary "dict" and then for each member in the first collection, do dict[member]++;
Then, loop over the second collection in the same way, but for each member do dict[member]--.
At the end, loop over all of the members in the dictionary:
private bool SetEqual (List<int> left, List<int> right) {
if (left.Count != right.Count)
return false;
Dictionary<int, int> dict = new Dictionary<int, int>();
foreach (int member in left) {
if (dict.ContainsKey(member) == false)
dict[member] = 1;
else
dict[member]++;
}
foreach (int member in right) {
if (dict.ContainsKey(member) == false)
return false;
else
dict[member]--;
}
foreach (KeyValuePair<int, int> kvp in dict) {
if (kvp.Value != 0)
return false;
}
return true;
}
Edit: As far as I can tell this is on the same order as the most efficient algorithm. This algorithm is O(N), assuming that the Dictionary uses O(1) lookups.