I am running an analysis where we use Shapely to create buffers around points (store locations) and then check whether other points (user locations) fall within that buffer distance. When I check the distance using Shapely, it turns out to be different from the distance I get from geopy. The haversine formula agrees with geopy, and a check on Google Maps using the measure distance function also gives around the same distance.
Here is an example:
from shapely.geometry import Point, shape
from pyproj import Proj, transform
from geopy.distance import vincenty, great_circle
pt_store=Point(transform(Proj(init='EPSG:4326'),Proj(init='EPSG:3857'),-76.799614, 39.435307))
pt_user=Point(transform(Proj(init='EPSG:4326'),Proj(init='EPSG:3857'),-76.79989,39.43604))
vincenty((39.435307,-76.799614),(39.43604,-76.79989)).meters
great_circle((39.435307,-76.799614),(39.43604,-76.79989)).meters
pt_store.distance(pt_user)
Vincenty: 84.77847691521336
Great_circle: 84.90640111682812
Shapely: 110.02637304449682
Haversine formula (http://www.movable-type.co.uk/scripts/latlong.html): 84.88
Which one is right? Shapely or others?
Also, is such a big difference (~22%) expected? Or am I missing something?
Best Answer
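The results differ because the principles and the algorithms are different (see Geographical distance). Shapely computes the Euclidean distance in the Cartesian plane of whatever coordinates it is given, here the projected EPSG:3857 coordinates.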
Vincenty, Great Circle and Haversine use either the geodesic distance (on an ellipsoid, Vincenty) or the great-circle distance (the shortest distance along the surface of a sphere) between two points. The shortest distance on the surface of a sphere is along the great-circle which contains the two points.
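As an illustration of the great-circle case, here is a minimal haversine sketch using only the standard library. The function name `haversine_m` and the mean Earth radius of 6371 km are assumptions made for this example; geopy and other tools may use a slightly different radius, which shifts the result by a few centimeters.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Great-circle distance in meters between two (lat, lon) points in degrees.

    radius_m is an assumed mean Earth radius; other libraries may use another value.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(h))

# Points from the question, as (latitude, longitude)
store = (39.435307, -76.799614)
user = (39.43604, -76.79989)

d_sphere = haversine_m(*store, *user)
print(round(d_sphere, 2))  # close to the 84.88 m reported in the question
```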
It is therefore normal that the Euclidean distances from Shapely, NumPy and SciPy differ from the Vincenty, great-circle and haversine distances. The smaller differences among the Vincenty, great-circle and haversine results come from the choice of ellipsoid or sphere radius, among other factors.
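To see where the roughly 110 m figure comes from, the sketch below reproduces the planar calculation by hand. It assumes EPSG:3857 is the spherical Web Mercator projection with radius 6378137 m, and the helper name `to_web_mercator` is made up for this example. Mercator stretches distances by roughly 1/cos(latitude), which is about 1.29 at this latitude, so dividing the planar distance by that factor gives an estimate close to the geodesic values.

```python
import math

R_MERCATOR = 6378137.0  # sphere radius used by Web Mercator (EPSG:3857)

def to_web_mercator(lat, lon):
    """Project (lat, lon) in degrees to spherical Web Mercator x, y in meters."""
    x = R_MERCATOR * math.radians(lon)
    y = R_MERCATOR * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y

# Points from the question, as (latitude, longitude)
store = (39.435307, -76.799614)
user = (39.43604, -76.79989)

sx, sy = to_web_mercator(*store)
ux, uy = to_web_mercator(*user)

# Euclidean distance in the projected plane, which is what a planar
# distance on EPSG:3857 coordinates measures
d_plane = math.hypot(ux - sx, uy - sy)

# Approximate Mercator scale factor at the store's latitude
scale = 1.0 / math.cos(math.radians(store[0]))

# Rough ground distance recovered by undoing the scale factor
d_ground_estimate = d_plane / scale

print(round(d_plane, 2), round(scale, 3), round(d_ground_estimate, 1))
```

The scale correction is only an approximation that works for short distances at a roughly constant latitude; it is shown here to explain the discrepancy, not as a replacement for a geodesic calculation.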
You can also change the ellipsoid
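As a dependency-free sketch of what changing the ellipsoid does, here is a hand-written Vincenty inverse calculation parameterized by the semi-major axis `a` and flattening `f`. The function name `vincenty_inverse_m` is an assumption for this example and this is not geopy's implementation; geopy's distance functions accept an `ellipsoid` argument for the same purpose. The WGS-84 and International 1924 parameters are the standard published values.

```python
import math

def vincenty_inverse_m(lat1, lon1, lat2, lon2, a, f, max_iter=200, tol=1e-12):
    """Geodesic distance in meters on an ellipsoid (Vincenty inverse formula)."""
    b = (1 - f) * a  # semi-minor axis
    u1 = math.atan((1 - f) * math.tan(math.radians(lat1)))  # reduced latitudes
    u2 = math.atan((1 - f) * math.tan(math.radians(lat2)))
    big_l = math.radians(lon2 - lon1)
    sin_u1, cos_u1 = math.sin(u1), math.cos(u1)
    sin_u2, cos_u2 = math.sin(u2), math.cos(u2)

    lam = big_l
    for _ in range(max_iter):
        sin_lam, cos_lam = math.sin(lam), math.cos(lam)
        sin_sigma = math.sqrt((cos_u2 * sin_lam) ** 2 +
                              (cos_u1 * sin_u2 - sin_u1 * cos_u2 * cos_lam) ** 2)
        if sin_sigma == 0:
            return 0.0  # coincident points
        cos_sigma = sin_u1 * sin_u2 + cos_u1 * cos_u2 * cos_lam
        sigma = math.atan2(sin_sigma, cos_sigma)
        sin_alpha = cos_u1 * cos_u2 * sin_lam / sin_sigma
        cos_sq_alpha = 1 - sin_alpha ** 2
        # cos_sq_alpha is zero only for lines along the equator
        cos_2sm = cos_sigma - 2 * sin_u1 * sin_u2 / cos_sq_alpha if cos_sq_alpha else 0.0
        c = f / 16 * cos_sq_alpha * (4 + f * (4 - 3 * cos_sq_alpha))
        lam_new = big_l + (1 - c) * f * sin_alpha * (
            sigma + c * sin_sigma * (cos_2sm + c * cos_sigma * (-1 + 2 * cos_2sm ** 2)))
        converged = abs(lam_new - lam) < tol
        lam = lam_new
        if converged:
            break
    else:
        raise ValueError("Vincenty iteration did not converge")

    u_sq = cos_sq_alpha * (a * a - b * b) / (b * b)
    big_a = 1 + u_sq / 16384 * (4096 + u_sq * (-768 + u_sq * (320 - 175 * u_sq)))
    big_b = u_sq / 1024 * (256 + u_sq * (-128 + u_sq * (74 - 47 * u_sq)))
    delta_sigma = big_b * sin_sigma * (cos_2sm + big_b / 4 * (
        cos_sigma * (-1 + 2 * cos_2sm ** 2) -
        big_b / 6 * cos_2sm * (-3 + 4 * sin_sigma ** 2) * (-3 + 4 * cos_2sm ** 2)))
    return b * big_a * (sigma - delta_sigma)

# Ellipsoid parameters: (semi-major axis in meters, flattening)
WGS84 = (6378137.0, 1 / 298.257223563)
INTL_1924 = (6378388.0, 1 / 297.0)

store = (39.435307, -76.799614)
user = (39.43604, -76.79989)

d_wgs84 = vincenty_inverse_m(*store, *user, *WGS84)
d_intl1924 = vincenty_inverse_m(*store, *user, *INTL_1924)
print(round(d_wgs84, 3), round(d_intl1924, 3))
```

Over a distance this short, swapping the ellipsoid changes the result only very slightly compared with the gap between the geodesic and the Web Mercator planar distance.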
Or use other libraries such as geodistance.
The differences among these geodesic and great-circle results are on the order of centimeters. Rounded to the nearest meter, all of them give 85 meters.
Which one is right? All of them, because it depends on the context: if you work with projected data (a Cartesian plane), you use the Euclidean distance (Shapely, NumPy, SciPy and many others); if not, you use one of the others.
There are also many other distances (SciPy spatial distances).
New
In support of the answer of Mintx