Yes, you will get these kinds of errors with a global Mercator projection: it is accurate at the equator and the distortion increases exponentially with latitude away from the equator. The distance distortion is exactly 2 (100%) at 60 degrees latitude. At your test latitudes (64.14 degrees) I compute a distortion of 2.294, exactly agreeing with the ratio 904/394 = 2.294. (Earlier I computed 2.301 but that was based on a sphere, not the WGS84 ellipsoid. The difference (of 0.3%) gives us a sense of the accuracy you might gain from using an ellipsoid-based projection versus the sphere-based Haversine formula.)
There is no such thing as a global projection that yields highly accurate distances everywhere. That's one reason the UTM zone system is used!
One solution is to use spherical geometry for all your calculations, but you have rejected that (which is reasonable if you're going to be doing complex operations, but the decision might be worth revisiting).
Another solution is to adapt the projection to the points being compared. For example, you could safely use a transverse Mercator (as in the UTM system) with a meridian lying near the center of the region of interest. Moving the meridian is a simple thing to do: just subtract the meridian's longitude from all the longitudes and use a single TM projection centered at the Prime Meridian (with a scale factor of 1, rather than the 0.9996 of the UTM system). For your work this will tend to be more accurate than using UTM itself. It will give correct angles (TM is conformal) and will be remarkably accurate for points separated by only a few tens of kilometers: expect better than six-digit accuracy. In fact, I would be inclined to attribute any small differences between these adapted-TM distances and the Haversine distances to the difference between the ellipsoid (used for the TM projection) and the sphere (used by Haversine), rather than to distortion in the projection.
Here is a paper that may help in beginning to drive your selection of distance measures. Take note of table 1 (pg. 4), copied below.
On geodesic distance modeling and spatial analysis (2004) - S. Banerjee
I would suggest that if you intend to use inter-UTM zone distance computations you should be using a geographic measure. Likewise, the spatial distribution of the points to roads within the UTM may be sufficient in the N/S extent to warrant the use of geographic distance measures.
The real question needs to start as: How accurate do my measures need to be? How many measures will I be making and is the added computational cost of a geographic measure inline with the required solution speed?
Edit for the comment:
The answer goes back to your accuracy tolerance. If I needed to compute in planar space over a large distance (3 UTM zones at mid latitudes is sufficiently large) with a high level of accuracy I would likely use a sinusoidal projection. The distances computed using a gnomonic projection are only completely accurate 'from a single reference point' (ref. as above). Are you only measuring from a single point in each UTM zone? If so, use the gnomonic projection. Otherwise, think about computing chordal distance, using a sinusoidal projection, or accepting the accuracy issues.
Edit for the additional comments above:
Given the accuracy requirement without any constraint on potential distance measures you really should be using geodesic measurements. Additionally, the gnomonic projection is not azimuthal equidistant, it just happens to draw the great circle curves as straight lines. As an alternative to geodesic computation you could reproject your data centered on the origin point of your measurement into an azimuthal equidistant projection*.
Having done this for a project involving 20,000+ points and some buffering, it is not efficient to perform for extremely fast lookup. It is a one time, let it run for a minute or so operation.
Best Answer
First, let's confirm what you already seem to know: If there are two different UTM zones, there are effectively two different coordinate reference systems (CRS), and distances between points across the zones cannot be calculated.
So, you must do one of these first:
There are many questions on this site about coordinate conversion or reprojection.
Either way, the problem then becomes how to accurately calculate distance given two coordinated points. Again, there are many questions on this site about distance calculation.
Briefly then, you can do one of these:
The easiest solution is probably to use one of the UTM zones, extended to include the outer point.