Efficiently find nearest dictionary key

So, this doesn't directly answer your question, because you specifically asked for something built-in to the .NET framework, but facing a similar problem, I found the following solution to work best, and I wanted to post it here for other searchers.

I used the TreeDictionary<K, V> from the C5 Collections (GitHub/NuGet), which is an implementation of a red-black tree.

It has Predecessor/TryPredecessor and WeakPredessor/TryWeakPredecessor methods (as well as similar methods for successors) to easily find the nearest items to a key.

More useful in your case, I think, is the RangeFrom/RangeTo/RangeFromTo methods that allow you to retrieve a range of key-value-pairs between keys.

Note that all of these methods can also be applied to the TreeDictionary<K, V>.Keys collection, which allow you to work with only the keys as well.

It really is a very neat implementation, and something like it deserves to be in the BCL.


It is not possible to efficiently find the nearest key with SortedList, SortedDictionary or any other "built-in" .NET type, if you need to interleave queries with inserts (unless your data arrives pre-sorted, or the collection is always small).

As I mentioned on the other question you referenced, I created three data structures related to B+ trees that provide find-nearest-key functionality for any sortable data type: BList<T>, BDictionary<K,V> and BMultiMap<K,V>. Each of these data structures provide FindLowerBound() and FindUpperBound() methods that work like C++'s lower_bound and upper_bound.

These are available in the Loyc.Collections NuGet package, and BDictionary typically uses about 44% less memory than SortedDictionary.


Since SortedDictionary is sorted on the key, you can create a sorted list of keys with

var keys = new List<DateTime>(dictionary.Keys);

and then efficiently perform binary search on it:

var index = keys.BinarySearch(key);

As the documentation says, if index is positive or zero then the key exists; if it is negative, then ~index is the index where key would be found at if it existed. Therefore the index of the "immediately smaller" existing key is ~index - 1. Make sure you handle correctly the edge case where key is smaller than any of the existing keys and ~index - 1 == -1.

Of course the above approach really only makes sense if keys is built up once and then queried repeatedly; since it involves iterating over the whole sequence of keys and doing a binary search on top of that there's no point in trying this if you are only going to search once. In that case even naive iteration would be better.

Update

As digEmAll correctly points out, you could also switch to SortedList<DateTime, decimal> so that the Keys collection implements IList<T> (which SortedDictionary.Keys does not). That interface provides enough functionality to perform a binary search on it manually, so you could take e.g. this code and make it an extension method on IList<T>.

You should also keep in mind that SortedList performs worse than SortedDictionary during construction if the items are not inserted in already-sorted order, although in this particular case it is highly likely that dates are inserted in chronological (sorted) order which would be perfect.