Python Haversine – Compute Distance Between Coordinates and Print CSV

I have a .csv file with 3 columns. The first column is titled Distance and it is empty. The second and third columns are titled Longitude and Latitude which contain coordinates, for example:

Distance,Longitude,Latitude
,-77.60483,40.31117
,-77.58167,40.3045
,-77.52883,40.24917
,-77.508,40.14917
,-77.49617,40.13117

I have written a script that computes the distance between two points using the haversine formula (see below), but I have to enter the coordinates manually, which is not desirable:

import math

lat1 = 40.31117
long1 = -77.60483
lat2 = 40.3045
long2 = -77.58167

def distance_on_unit_sphere(lat1, long1, lat2, long2):

    # Converts lat & long to spherical coordinates in radians.
    degrees_to_radians = math.pi/180.0

    # phi = 90 - latitude
    phi1 = (90.0 - lat1)*degrees_to_radians
    phi2 = (90.0 - lat2)*degrees_to_radians

    # theta = longitude
    theta1 = long1*degrees_to_radians
    theta2 = long2*degrees_to_radians

    # Compute the spherical distance from spherical coordinates.
    # For two locations in spherical coordinates
    # (1, theta, phi) and (1, theta', phi'):
    #   cosine( arc length ) =
    #     sin phi sin phi' cos(theta-theta') + cos phi cos phi'
    #   distance = rho * arc length

    cos = (math.sin(phi1)*math.sin(phi2)*math.cos(theta1 - theta2) +
           math.cos(phi1)*math.cos(phi2))
    # rounding can push the value just outside [-1, 1], which makes acos fail
    cos = max(-1.0, min(1.0, cos))
    arc = math.acos(cos)*6371 #radius of the earth in km

    return arc

print(distance_on_unit_sphere(lat1,long1,lat2,long2))

My goal is to populate the empty Distance column using Python. The script should compute the distance between the 1st location (-77.60483,40.31117) and the 2nd location (-77.58167,40.3045) and write that value into the Distance column, then compute the distance between the 2nd (-77.58167,40.3045) and 3rd (-77.52883,40.24917) locations and write that value below the previous one, and so on for the subsequent coordinates. I need some help, or an example, showing how to read the .csv file, run the haversine formula, and write the results into the appropriate location in an output .csv.
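
A note on the formula: strictly speaking, the function in the script above applies the spherical law of cosines rather than the haversine formula. Both give the great-circle distance on a sphere. The haversine form is often preferred for points that are close together, because it avoids taking the arccosine of a value very near 1. Below is a minimal sketch of the haversine form, assuming the same 6371 km Earth radius. The names `haversine_km` and `EARTH_RADIUS_KM` are made up for this illustration and are not part of the original script.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius, same value as in the script above


def haversine_km(lat1, long1, lat2, long2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(long2 - long1)

    # a is the haversine of the central angle between the two points
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # clamp against rounding error before the square root and arcsine
    a = min(1.0, max(0.0, a))
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# first two points from the sample data; the result is roughly 2.1 km
print(round(haversine_km(40.31117, -77.60483, 40.3045, -77.58167), 4))
```

The argument order (latitude first, then longitude) matches `distance_on_unit_sphere`, so the two functions should be interchangeable in the rest of the script.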

Best Answer

If I've understood correctly, this code should do what you're after. Using your input:

Distance,Longitude,Latitude
,-77.60483,40.31117
,-77.58167,40.3045
,-77.52883,40.24917
,-77.508,40.14917
,-77.49617,40.13117

I get the output below. I've assumed that you want the distances to go forward, i.e. that line 2 has its distance set to the distance to the next point (line 3). The last line is set to 0, as there is no following line to measure to.

Distance,Longitude,Latitude
2.0992,-77.6048,40.3112
7.6122,-77.5817,40.3045
11.2593,-77.5288,40.2492
2.2399,-77.5080,40.1492
0.0000,-77.4962,40.1312

Keep your existing code, and append this snippet. The csv library makes it nice and easy to work with CSV files.

I'm also using a deque (double-ended queue), here as a FIFO (first in, first out) queue of points. Once it holds two entries, I compute the distance between the two points, write it out, then discard the oldest point.
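
As a small aside, separate from the script itself, the queue behaviour just described can be seen in isolation. This is only an illustrative sketch; the helper name `consecutive_pairs` is hypothetical and does not appear in the script.

```python
from collections import deque


def consecutive_pairs(items):
    """Yield (older, newer) for each pair of neighbouring items via a two-slot queue."""
    window = deque()
    for item in items:
        window.append(item)            # newest entry goes in on the right
        if len(window) == 2:
            older = window.popleft()   # oldest entry comes out on the left (FIFO)
            yield older, item


print(list(consecutive_pairs(["A", "B", "C", "D"])))
# [('A', 'B'), ('B', 'C'), ('C', 'D')]
```

After the loop, one item is still left in the queue (here `"D"`). That is the last point, which the script writes out with a distance of 0.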

import csv
import collections

# use a deque. once this hits 2 entries, compute the distance
# and then remove the oldest entry
todo = collections.deque()

# plain "r" mode: "rU" was removed in Python 3.11, and Python 3 text mode
# already handles universal newlines. "with" closes both files for us.
with open("test.csv", "r") as filein, open("output.csv", "w") as fileout:
    fileout.write("Distance,Longitude,Latitude\n")
    csvrdr = csv.reader(filein, delimiter=",")
    next(csvrdr) # skip header
    for line in csvrdr:
        _, longitude, latitude = line # _ means discard
        longitude = float(longitude)
        latitude = float(latitude)
        todo.append((longitude, latitude))
        if len(todo) == 2:
            # got two points, compute distance and remove the oldest
            oldlon, oldlat = todo.popleft()
            distance = distance_on_unit_sphere(latitude, longitude, oldlat, oldlon)
            fileout.write("%2.4f,%2.4f,%2.4f\n" % (distance, oldlon, oldlat))
    # got one entry in deque, spit that out with 0
    assert len(todo) == 1
    oldlon, oldlat = todo.popleft()
    fileout.write("%2.4f,%2.4f,%2.4f\n" % (0.0, oldlon, oldlat))
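
For comparison, here is a hedged variant sketch (assuming Python 3) that leans a little more on the csv library: `csv.DictReader` looks columns up by their header name, and `csv.writer` handles the output formatting. The function name `fill_distance_column` and its `dist` parameter are hypothetical. In practice `dist` would be the distance function from the question. The demo uses a stand-in only so that the block runs on its own.

```python
import csv
import io


def fill_distance_column(infile, outfile, dist):
    """Copy Longitude/Latitude rows to outfile, filling in the Distance column.

    dist is any callable taking (lat1, long1, lat2, long2). Each row gets the
    distance to the next row, and the last row gets 0.
    """
    reader = csv.DictReader(infile)  # rows become dicts keyed by header name
    points = [(float(row["Longitude"]), float(row["Latitude"])) for row in reader]

    writer = csv.writer(outfile, lineterminator="\n")
    writer.writerow(["Distance", "Longitude", "Latitude"])
    for i, (lon, lat) in enumerate(points):
        if i + 1 < len(points):
            next_lon, next_lat = points[i + 1]
            d = dist(lat, lon, next_lat, next_lon)
        else:
            d = 0.0  # no following point to measure to
        writer.writerow(["%.4f" % d, "%.4f" % lon, "%.4f" % lat])


# Demo on in-memory text with a stand-in "distance" (absolute latitude
# difference), purely to show the data flow.
sample = io.StringIO(
    "Distance,Longitude,Latitude\n"
    ",-77.60483,40.31117\n"
    ",-77.58167,40.3045\n"
)
result = io.StringIO()
fill_distance_column(sample, result, lambda lat1, lon1, lat2, lon2: abs(lat2 - lat1))
print(result.getvalue())
```

With real files, the two `io.StringIO` objects would be replaced by files opened with `open("test.csv", newline="")` and `open("output.csv", "w", newline="")`, which is how the csv documentation recommends opening files in Python 3. Unlike the deque version, this sketch loads all points into memory first, which is fine for small files but gives up the streaming behaviour.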