How to find the nearest neighbors for latitude and longitude point on python?
I honestly don't know if using a kd-tree would work correctly, but my hunch says it would be inaccurate.
I think you need to use something like greater circle distance to get accurate distances.
from math import radians, cos, sin, asin, sqrt, degrees, atan2
def validate_point(p):
lat, lon = p
assert -90 <= lat <= 90, "bad latitude"
assert -180 <= lon <= 180, "bad longitude"
# original formula from http://www.movable-type.co.uk/scripts/latlong.html
def distance_haversine(p1, p2):
"""
Calculate the great circle distance between two points
on the earth (specified in decimal degrees)
Haversine
formula:
a = sin²(Δφ/2) + cos φ1 ⋅ cos φ2 ⋅ sin²(Δλ/2)
_ ____
c = 2 ⋅ atan2( √a, √(1−a) )
d = R ⋅ c
where φ is latitude, λ is longitude, R is earth’s radius (mean radius = 6,371km);
note that angles need to be in radians to pass to trig functions!
"""
lat1, lon1 = p1
lat2, lon2 = p2
for p in [p1, p2]:
validate_point(p)
R = 6371 # km - earths's radius
# convert decimal degrees to radians
lat1, lon1, lat2, lon2 = map(radians, [lat1, lon1, lat2, lon2])
# haversine formula
dlon = lon2 - lon1
dlat = lat2 - lat1
a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
c = 2 * asin(sqrt(a)) # 2 * atan2(sqrt(a), sqrt(1-a))
d = R * c
return d
scikit-learn
provides a BallTree
class that supports the Haversine metric. See also this SO question.
I think you are trying to solve the k Nearest Neighbor problem.
Since your data set lies in 2D, then a kd-tree would work nicely, in general, I do not know about spicy.
However, if your points start living in a higher dimension, then kd-tree will not be a smart choice.