Difference between Vincenty and great-circle distance calculations?
According to Wikipedia, Vincenty's formula is slower but more accurate:
Vincenty's formulae are two related iterative methods used in geodesy to calculate the distance between two points on the surface of a spheroid, developed by Thaddeus Vincenty (1975a). They are based on the assumption that the figure of the Earth is an oblate spheroid, and hence are more accurate than methods such as great-circle distance which assume a spherical Earth.
The accuracy difference is ~0.17% over a distance of 428 meters in Israel. I've made a quick-and-dirty speed test:
<class 'geopy.distance.vincenty'> : Total 0:00:04.125913, (0:00:00.000041 per calculation)
<class 'geopy.distance.great_circle'> : Total 0:00:02.467479, (0:00:00.000024 per calculation)
Code:
import datetime

# Note: geopy 2.0 removed vincenty, so this test needs geopy < 2.0.
from geopy.distance import great_circle
from geopy.distance import vincenty

p1 = (31.8300167, 35.0662833)  # (lat, lon)
p2 = (31.83, 35.0708167)       # (lat, lon)

NUM_TESTS = 100000

for strategy in vincenty, great_circle:
    # Time NUM_TESTS distance calculations with each strategy.
    before = datetime.datetime.now()
    for i in range(NUM_TESTS):
        d = strategy(p1, p2).meters
    after = datetime.datetime.now()
    duration = after - before
    print("%-40s: Total %s, (%s per calculation)" % (strategy, duration, duration / NUM_TESTS))
To conclude: Vincenty's formula roughly doubles the calculation time compared to great-circle, and its accuracy gain at the point tested is ~0.17%.
Since the calculation time is negligible, Vincenty's formula is preferred for every practical need.
Update: Following the insightful comments by whuber and cffk, and cffk's answer, I agree that the accuracy gain should be compared with the error, not the measurement. Hence, Vincenty's formula is a few orders of magnitude more accurate, not ~0.17%.
If you're using geopy, then the great_circle and vincenty distances are equally convenient to obtain. In this case, you should almost always use the one that gives you the more accurate result, i.e., vincenty. The two considerations (as you point out) are speed and accuracy.
Vincenty is about two times slower. But in a real application the increased running time is probably negligible. Even if your application called for a million distance calculations, the timings in the question (about 41 μs versus 24 μs per call) put the difference at under 20 seconds in total.
For the points you use, the error in vincenty is 6 μm and the error in the great circle distance is 0.75 m. I would then say that vincenty is 120000 times more accurate (rather than 0.17% more accurate). For general points, the error in the great circle distance can be as much as 0.5%. So can you live with a 0.5% error in distances? For casual use (what's the distance from Cape Town to Cairo?), probably you can. However, many GIS applications have much stricter accuracy requirements. (0.5% is 5m over 1km. That really does make a difference.)
Nearly all serious mapping work is carried out on the reference ellipsoid and it therefore makes sense that distances should be measured on the ellipsoid too. Maybe you can get away with great-circle distances today. But for each new application, you will have to check whether this is still acceptable. Better is just to use the ellipsoidal distance from the start. You'll sleep better at night.
ADDENDUM (May 2017)
In reply to the answer given by @craig-hicks. The vincenty() method in geopy does have a potentially fatal flaw: it throws an error for nearly antipodal points. The documentation in the code suggests increasing the number of iterations. But this is not a general solution because the iterative method used by vincenty() is unstable for such points (each iteration takes you further from the correct solution).
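To make the role of the iteration limit concrete, here is a minimal pure-Python sketch of Vincenty's inverse method on the WGS84 ellipsoid. It is not geopy's code: the function name, the `tol` parameter and the choice to raise `ValueError` are assumptions made for illustration. The loop refines the auxiliary longitude difference until it stops changing, and gives up after `max_iterations` passes.

```python
import math

# WGS84 ellipsoid parameters
A_AXIS = 6378137.0               # semi-major axis in meters
F = 1 / 298.257223563            # flattening
B_AXIS = (1 - F) * A_AXIS        # semi-minor axis in meters


def vincenty_inverse_m(p1, p2, max_iterations=20, tol=1e-12):
    """Ellipsoidal distance in meters between two (lat, lon) points in degrees.

    Hypothetical helper for illustration; raises ValueError if the
    iteration has not converged after max_iterations passes.
    """
    lat1, lon1 = map(math.radians, p1)
    lat2, lon2 = map(math.radians, p2)
    # Reduced latitudes
    u1 = math.atan((1 - F) * math.tan(lat1))
    u2 = math.atan((1 - F) * math.tan(lat2))
    sin_u1, cos_u1 = math.sin(u1), math.cos(u1)
    sin_u2, cos_u2 = math.sin(u2), math.cos(u2)
    big_l = lon2 - lon1
    lam = big_l                  # first guess for the auxiliary longitude difference
    for _ in range(max_iterations):
        sin_lam, cos_lam = math.sin(lam), math.cos(lam)
        sin_sigma = math.hypot(cos_u2 * sin_lam,
                               cos_u1 * sin_u2 - sin_u1 * cos_u2 * cos_lam)
        if sin_sigma == 0:
            return 0.0           # coincident points (exact antipodes are not special-cased)
        cos_sigma = sin_u1 * sin_u2 + cos_u1 * cos_u2 * cos_lam
        sigma = math.atan2(sin_sigma, cos_sigma)
        sin_alpha = cos_u1 * cos_u2 * sin_lam / sin_sigma
        cos_sq_alpha = 1 - sin_alpha ** 2
        # cos_sq_alpha == 0 only for equatorial lines, where this term drops out
        cos_2sm = (cos_sigma - 2 * sin_u1 * sin_u2 / cos_sq_alpha) if cos_sq_alpha else 0.0
        c = F / 16 * cos_sq_alpha * (4 + F * (4 - 3 * cos_sq_alpha))
        lam_new = big_l + (1 - c) * F * sin_alpha * (
            sigma + c * sin_sigma * (cos_2sm + c * cos_sigma * (2 * cos_2sm ** 2 - 1)))
        converged = abs(lam_new - lam) < tol
        lam = lam_new
        if converged:
            break
    else:
        raise ValueError("Vincenty iteration did not converge")
    u_sq = cos_sq_alpha * (A_AXIS ** 2 - B_AXIS ** 2) / B_AXIS ** 2
    big_a = 1 + u_sq / 16384 * (4096 + u_sq * (-768 + u_sq * (320 - 175 * u_sq)))
    big_b = u_sq / 1024 * (256 + u_sq * (-128 + u_sq * (74 - 47 * u_sq)))
    delta_sigma = big_b * sin_sigma * (cos_2sm + big_b / 4 * (
        cos_sigma * (2 * cos_2sm ** 2 - 1)
        - big_b / 6 * cos_2sm * (4 * sin_sigma ** 2 - 3) * (4 * cos_2sm ** 2 - 3)))
    return B_AXIS * big_a * (sigma - delta_sigma)


# The two nearby points from the question converge in a few passes.
print(vincenty_inverse_m((31.8300167, 35.0662833), (31.8300000, 35.0708167)))
```

For nearby points like these the loop settles almost immediately. A nearly antipodal pair is the kind of input for which the loop can exhaust `max_iterations`, and raising the limit does not help when the iteration itself is moving away from the solution.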
Why do I characterize the problem as "potentially fatal"? Because any use of the distance function within another software library needs to be able to handle the exception. Handling it by returning a NaN or the great-circle distance may not be satisfactory, because the resulting distance function will not obey the triangle inequality which precludes its use, e.g., in vantage-point trees.
The situation isn't completely bleak. My Python package geographiclib computes the geodesic distance accurately without any failures. The geopy pull request #144 changes geopy's distance function to use the geographiclib package if it's available. Unfortunately, this pull request has been in limbo since August 2016.
ADDENDUM (May 2018)
geopy 1.13.0 now uses the geographiclib package for computing distances. Here's a sample call (based on the example in the original question):
>>> from geopy.distance import great_circle
>>> from geopy.distance import geodesic
>>> p1 = (31.8300167,35.0662833) # (lat, lon) - https://goo.gl/maps/TQwDd
>>> p2 = (31.8300000,35.0708167) # (lat, lon) - https://goo.gl/maps/lHrrg
>>> geodesic(p1, p2).meters
429.1676644986777
>>> great_circle(p1, p2).meters
428.28877358686776
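For comparison, the great-circle figure in the sample call above can be approximated without geopy. The following is a minimal sketch of the haversine formula on a sphere. The function name is hypothetical, and the mean radius of 6371.009 km is an assumption chosen to match what I believe geopy uses by default.

```python
import math

EARTH_RADIUS_M = 6371009.0  # assumed mean Earth radius in meters


def haversine_m(p1, p2, radius=EARTH_RADIUS_M):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p1)
    lat2, lon2 = map(math.radians, p2)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # Haversine of the central angle between the two points
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(h))


p1 = (31.8300167, 35.0662833)  # (lat, lon)
p2 = (31.8300000, 35.0708167)  # (lat, lon)
print(haversine_m(p1, p2))     # close to the great_circle value above
```

The only geometric input is a single radius, which is why the result differs from the ellipsoidal distance by a fraction of a percent.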
My apologies for posting a second answer here, but I am taking the opportunity to respond to the request by @craig-hicks to provide accuracy and timing comparisons for various algorithms for computing the geodesic distance. This paraphrases a comment I made on my pull request #144 for geopy, which allows geopy to use either of two implementations of my algorithm for geodesics: a native Python implementation, geodesic(geographiclib), and an implementation in C, geodesic(pyproj).
Here is some timing data. Times are in microseconds per call:
method                         distance  destination
geopy great_circle                 20.4         17.1
geopy vincenty                     40.3         30.4
geopy geodesic(pyproj)             37.1         31.1
geopy geodesic(geographiclib)     302.9        124.1
Here is the accuracy of the geodesic calculations based on my Geodesic Test Set. The errors are given in units of microns (1e-6 m):
method                         distance  destination
geopy vincenty                  205.629      141.945
geopy geodesic(pyproj)            0.007        0.013
geopy geodesic(geographiclib)     0.011        0.010
I've included hannosche's pull request #194, which fixes a bad bug in the destination function. Without this fix, the error in the destination calculation for vincenty is 8.98 meters.
19.2% of the test cases failed with vincenty.distance (iterations = 20). However, the test set is skewed towards cases which would cause this failure.
With random points on the WGS84 ellipsoid, the Vincenty algorithm is guaranteed to fail 16.6 out of 1000000 times (the correct solution is an unstable fixed point of the Vincenty method).
With the geopy implementation of Vincenty and iterations = 20, the failure rate is 82.8 per 1000000. With iterations = 200, the failure rate is 21.2 per 1000000.
Even though these rates are small, failures can be quite common. For example, in a dataset of 1000 random points (think the world's airports, perhaps), computing the full distance matrix would fail on average 16 times (with iterations = 20).
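The scaling from a per-pair failure rate to a whole distance matrix can be sketched as follows. This is a rough illustration under assumptions of my own: failures are treated as independent, and the full matrix is counted as N × N entries, ignoring symmetry and the zero-distance diagonal. The function names are hypothetical.

```python
def expected_failures(n_points, failures_per_million):
    """Expected number of failing entries in an n_points x n_points distance matrix."""
    n_entries = n_points * n_points
    return n_entries * failures_per_million / 1e6


def prob_at_least_one_failure(n_points, failures_per_million):
    """Probability that at least one entry fails, assuming independent failures."""
    p = failures_per_million / 1e6
    return 1.0 - (1.0 - p) ** (n_points * n_points)


# Failure rates per million quoted above: the guaranteed-failure rate,
# then geopy with iterations = 20 and iterations = 200.
for rate in (16.6, 82.8, 21.2):
    print(rate, expected_failures(1000, rate), prob_at_least_one_failure(1000, rate))
```

Whichever rate applies, a matrix of this size is all but certain to contain at least one failing pair, so the calling code has to be prepared for the exception.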