Find minimum distance between points of two lists in Python
The easiest way is probably to use scipy.spatial.distance.cdist
:
import numpy as np
from scipy.spatial import distance
s1 = np.array([(0,0), (0,1), (1,0), (1,1)])
s2 = np.array([(3,2), (1,9)])
print(distance.cdist(s1,s2).min(axis=1))
# array([3.60555128, 3.16227766, 2.82842712, 2.23606798])
Some more speed might be gained by directly outputting 0
for any point from s1
that is also in s2
.
To calculate the N distances, there's not a better method than brute forcing all of the possibilities. If you wanted something higher level, like perhaps the greatest or smallest distance, you could reduce the number of calculations based on some external knowledge, but the given your setup, the best you're going to get is O(n^2) performance.
EDIT: Given your comment, there are methods, that involve the general "divide and conquer" approach. Wikipedia has a good overview, and I'll copy a perhaps relevant bit here:
The problem can be solved in O(n log n) time using the recursive divide and conquer approach, e.g., as follows:
- Sort points according to their x-coordinates.
- Split the set of points into two equal-sized subsets by a vertical line x = xmid.
- Solve the problem recursively in the left and right subsets. This yields the left-side and right-side minimum distances dLmin and dRmin, respectively.
- Find the minimal distance dLRmin among the set of pairs of points in which one point lies on the left of the dividing vertical and the other point lies to the right.
- The final answer is the minimum among dLmin, dRmin, and dLRmin.
Brute force is really the main way. You might be able to get squeeze some performance out using a KDTree since your data is low dimensional. scipy.spatial.KDTree
kdtree = scipy.spatial.KDTree(s2)
neighbours = kdtree.query(s1)
Have you tried using cdist
:
import numpy as np
from scipy.spatial.distance import cdist
np.min(cdist(s1,s2))
returns
array([ 3.60555128, 3.16227766, 2.82842712, 2.23606798])
You might also get a performance boost by replacing s1
and s2
by np.array
s, although scipy
might be doing that internally, I'm not sure.
If this isn't optimised enough, I think you can do this in O(ns2*log(ns2) + ns1) by finding the Voronoi diagram of the points in s2
and then looping through s1
to see which region the point falls in which will match with the closest point in s2
.