Yes, you will get these kinds of errors with a global Mercator projection: it is accurate at the equator and the distortion increases exponentially with latitude away from the equator. The distance distortion is exactly 2 (100%) at 60 degrees latitude. At your test latitudes (64.14 degrees) I compute a distortion of 2.294, exactly agreeing with the ratio 904/394 = 2.294. (Earlier I computed 2.301 but that was based on a sphere, not the WGS84 ellipsoid. The difference (of 0.3%) gives us a sense of the accuracy you might gain from using an ellipsoid-based projection versus the sphere-based Haversine formula.)
There is no such thing as a global projection that yields highly accurate distances everywhere. That's one reason the UTM zone system is used!
One solution is to use spherical geometry for all your calculations, but you have rejected that (which is reasonable if you're going to be doing complex operations, but the decision might be worth revisiting).
Another solution is to adapt the projection to the points being compared. For example, you could safely use a transverse Mercator (as in the UTM system) with a meridian lying near the center of the region of interest. Moving the meridian is a simple thing to do: just subtract the meridian's longitude from all the longitudes and use a single TM projection centered at the Prime Meridian (with a scale factor of 1, rather than the 0.9996 of the UTM system). For your work this will tend to be more accurate than using UTM itself. It will give correct angles (TM is conformal) and will be remarkably accurate for points separated by only a few tens of kilometers: expect better than six-digit accuracy. In fact, I would be inclined to attribute any small differences between these adapted-TM distances and the Haversine distances to the difference between the ellipsoid (used for the TM projection) and the sphere (used by Haversine), rather than to distortion in the projection.
Because the principles and the algorithms are different (look at Geographical distance)
- Shapely use the euclidean distance in a cartesian plane and the shortest distance between two points in a plane is a straight line which contains the two points.
import numpy as np
print np.linalg.norm(np.array(pt_user) - np.array(pt_store))
110.02637304449682 # meters
from scipy.spatial import distance
print distance.euclidean(pt_user, pt_store)
110.02637304449682 # meters
Vincenty, Great Circle and Haversine use either the geodesic distance (on an ellipsoid, Vincenty) or the great-circle distance (the shortest distance along the surface of a sphere) between two points. The shortest distance on the surface of a sphere is along the great-circle which contains the two points.
Therefore it is normal that the Shapely, Numpy and Scipy euclidean distances differ from the Vincenty, Great Circle and Haversine distances and the differences between the Vincenty, Great Circles and Haversine distances are linked to the choice of an ellipsoid, and many other things.
You can also change the ellipsoid
print vincenty((39.435307,-76.799614),(39.43604,-76.79989),ellipsoid='WGS-84')
0.0847784769149 km
print vincenty((39.435307,-76.799614),(39.43604,-76.79989),ellipsoid='GRS-80')
0.0847784769128 km
Or use other libraries as geodistance
print geodistance.distanceVincenty(39.435307,-76.799614,39.43604,-76.79989, ellipsoid='WGS-84')
(0.08477847691523362, -16.276730447136675) # distance, azimuth
print geodistance.distanceHaversine(39.435307,-76.799614,39.43604,-76.79989)
(0.08488248586585143, -16.214988211007256)
You can see that all the differences are centimetric. With metric precision, all the values = 85 meters.
Which one is right? All, because it depends on the context: if you work with projected data (cartesian plane), you use the Euclidean distance (Shapely, Numpy ,Scipy and many others), if not, one of the others.
They are also many other distances (Scipy Spatial distances)
New
In support of the answer of Mintx
pt_store=Point(transform(Proj(init='EPSG:4326'),Proj(init='EPSG:31370'),-76.799614, 39.435307))
pt_user=Point(transform(Proj(init='EPSG:4326'),Proj(init='EPSG:31370'),-76.79989,39.43604))
pt_store.distance(pt_user)
86.26511001003892
Best Answer
The problem is indicated by the word "well-conditioned." It's an issue of computer arithmetic, not mathematics.
Here are the basic facts to consider:
One radian on the earth spans almost 10^7 meters.
The cosine function for arguments x near 0 is approximately equal to 1 - x^2/2.
Double-precision floating point has about 15 decimal digits of precision.
Points (2) and (3) imply that when x is around one meter, or 10^-7 radians (point 1), almost all precision is lost: 1 - (10^-7)^2 = 1 - 10^-14 is a calculation in which the first 14 of the 15 significant digits all cancel, leaving just one digit to represent the result. Flipping this around (which is what the inverse cosine, "acos", does) means that computing acos for angles that correspond to meter-length distances cannot be done with any meaningful accuracy. (In certain bad cases the loss of precision gives a value where acos is not even defined, so the code will break down and give no answer, a nonsense answer, or crash the machine.) Similar considerations suggest you should avoid using the inverse cosine if distances less than a few hundred meters are involved, depending on how much precision you're willing to lose.
The role played by acos in the naive law-of-cosines formula is to convert an angle to a distance. That role is played by atan2 in the haversine formula. The tangent of a small angle x is approximately equal to x itself. Consequently the inverse tangent of a number, being approximately that number, is computed essentially with no loss in precision. This is why the haversine formula, although mathematically equivalent to the law of cosines formula, is far superior for small distances (on the order of 1 meter or less).
Here is a comparison of the two formulas using 100 random point-pairs on the globe (using Mathematica's double-precision calculations).
You can see that for distances less than about 0.5 meters, the two formulas diverge. Above 0.5 meters they tend to agree. To show how closely they agree, the next plot shows the ratios of the law of cosines:haversine results for another 100 random point pairs, with their latitudes and longitudes randomly differing by up to 5 meters.
This shows that the law of cosines formula is good to 3-4 decimal places once the distance exceeds 5-10 meters. The number of decimal places of accuracy increases quadratically; thus at 50-100 meters (one order of magnitude) you get 5-6 dp accuracy (two orders of magnitude); at 500-1000 meters you get 7-8 dp, etc.