from list of integers, get number closest to a given value
If we are not sure that the list is sorted, we could use the built-in min()
function, to find the element which has the minimum distance from the specified number.
>>> min(myList, key=lambda x:abs(x-myNumber))
4
Note that it also works with dicts with int keys, like {1: "a", 2: "b"}
. This method takes O(n) time.
If the list is already sorted, or you could pay the price of sorting the array once only, use the bisection method illustrated in @Lauritz's answer which only takes O(log n) time (note however checking if a list is already sorted is O(n) and sorting is O(n log n).)
I'll rename the function take_closest
to conform with PEP8 naming conventions.
If you mean quick-to-execute as opposed to quick-to-write, min
should not be your weapon of choice, except in one very narrow use case. The min
solution needs to examine every number in the list and do a calculation for each number. Using bisect.bisect_left
instead is almost always faster.
The "almost" comes from the fact that bisect_left
requires the list to be sorted to work. Hopefully, your use case is such that you can sort the list once and then leave it alone. Even if not, as long as you don't need to sort before every time you call take_closest
, the bisect
module will likely come out on top. If you're in doubt, try both and look at the real-world difference.
from bisect import bisect_left
def take_closest(myList, myNumber):
"""
Assumes myList is sorted. Returns closest value to myNumber.
If two numbers are equally close, return the smallest number.
"""
pos = bisect_left(myList, myNumber)
if pos == 0:
return myList[0]
if pos == len(myList):
return myList[-1]
before = myList[pos - 1]
after = myList[pos]
if after - myNumber < myNumber - before:
return after
else:
return before
Bisect works by repeatedly halving a list and finding out which half myNumber
has to be in by looking at the middle value. This means it has a running time of O(log n) as opposed to the O(n) running time of the highest voted answer. If we compare the two methods and supply both with a sorted myList
, these are the results:
$ python -m timeit -s " from closest import take_closest from random import randint a = range(-1000, 1000, 10)" "take_closest(a, randint(-1100, 1100))" 100000 loops, best of 3: 2.22 usec per loop $ python -m timeit -s " from closest import with_min from random import randint a = range(-1000, 1000, 10)" "with_min(a, randint(-1100, 1100))" 10000 loops, best of 3: 43.9 usec per loop
So in this particular test, bisect
is almost 20 times faster. For longer lists, the difference will be greater.
What if we level the playing field by removing the precondition that myList
must be sorted? Let's say we sort a copy of the list every time take_closest
is called, while leaving the min
solution unaltered. Using the 200-item list in the above test, the bisect
solution is still the fastest, though only by about 30%.
This is a strange result, considering that the sorting step is O(n log(n))! The only reason min
is still losing is that the sorting is done in highly optimalized c code, while min
has to plod along calling a lambda function for every item. As myList
grows in size, the min
solution will eventually be faster. Note that we had to stack everything in its favour for the min
solution to win.