Intersection of two sets in most optimized way

Well, if you use LINQ's Intersect method it will build up a HashSet of the second sequence, and then check each element of the first sequence against it. So it's O(M+N)... and you can use foo.Intersect(bar).Any() to get an early-out.

Of course, if you store one (either) set in a HashSet<T> to start with, you can just iterate over the other one checking for containment on each step. You'd still need to build the set to start with though.

Fundamentally you've got an O(M+N) problem whatever you do - you're not going to get cheaper than that (there's always the possibility that you'll have to look at every element) and if your hash codes are reasonable, you should be able to achieve that complexity easily. Of course, some solutions may give better constant factors than others... but that's performance rather than complexity ;)

EDIT: As noted in the comments, there's also ISet<T>.Overlaps - if you've already got either set with a static type of ISet<T> or a concrete implementation, calling Overlaps makes it clearer what you're doing. If both of your sets are statically typed as ISet<T>, use larger.Overlaps(smaller) (where larger and smaller are in terms of the size of the set) as I'd expect an implementation of Overlaps to iterate over the argument and check each element against contents of the set you call it on.


As mentioned , Applying Any() will give you some performance.

I tested it on pretty big dataset and it gave 25% improvements.

Also applying larger.Intersect(smaller) rather than the opposite is very important, in my case, it gave 35% improvements.

Also ordering the list before applying intersect gave another 7-8%.

Another thing to keep in mind that depending on the use case you can completely avoid applying intersect.

For example, for an integer list, if the maximum and minimum are not within the same bounders you don't need to apply intersect since they will never do.

The same goes for a string list with the same idea applied to first letter.

Again depending on your case, try as much as possible to find a rule where intersection is impossible to avoid calling it.