Reverse geocoding of Pandas DataFrame with Lat/Long columns

Tags: data-frame, geopy, pandas, python, reverse-geocoding

I have a DataFrame with Lat/Long coordinates in separate series. I am trying to use GeoPy and Nominatim to get the reverse raw address. I keep getting errors such as:

TypeError: reverse() takes 2 positional arguments but 3 were given

or

needs coordinate pair or Point

Here is some sample data as a test DataFrame:

Id  Latitude   Longitude
1   30.197535  -97.662015
2   34.895699  -82.218903
3   33.6367    -84.428101
4   33.6367    -84.428101
5   32.733601  -117.19

and here is my code:

# summarize travelers by country using geopy geocoder

# initialize Nominatim API
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="http")

#check the dtypes, they're float
test.dtypes

# combine lat/long into new column
test['geom'] = test["Latitude"].map(str) + ',' + test['Longitude'].map(str)
# check first record
test['geom'][0]

# alternative approach #1
test['geom'] = test['Latitude'].apply(str) + "," + test['Longitude'].apply(str)
test['geom'][0]

# reverse geocode
test['address'] = geolocator.reverse(test['Latitude'],test['Longitude']).raw

What am I doing wrong? I am new to this and it is infuriating. All examples I've found don't work or only show geocoding with one record.

Best Answer

This answer uses the Nominatim geocoder from the GeoPy geocoding library for Python; any other geocoder that GeoPy supports could be used instead. For more details, please check the documentation.

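The coordinates of each point must be passed to reverse() as a single argument, either a (latitude, longitude) pair or a Point (see this thread for more details). Passing latitude and longitude as two separate arguments, as in the question, raises the TypeError shown there, and a value that cannot be read as a coordinate pair raises: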

ValueError: Must be a coordinate pair or Point.

That is why Point is additionally imported (from geopy.point import Point) in the full example below.
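As a small sketch of what a coordinate pair looks like in practice, the two columns can be zipped into (latitude, longitude) tuples instead of being joined into strings. The DataFrame name test follows the question; the names pairs and first_pair are illustrative only and do not come from the original post.

```python
import pandas as pd

# First three rows of the sample data from the question
test = pd.DataFrame({
    "Id": [1, 2, 3],
    "Latitude": [30.197535, 34.895699, 33.6367],
    "Longitude": [-97.662015, -82.218903, -84.428101],
})

# One (lat, lon) tuple per row. Each tuple is a single value, so it can be
# handed to a reverse geocoder as one argument rather than two.
pairs = list(zip(test["Latitude"], test["Longitude"]))
first_pair = pairs[0]
```

Each element of pairs can then be passed on its own, for example as geolocator.reverse(first_pair), assuming a geolocator created as in the question.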

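The full example below also uses np.vectorize from NumPy to apply the function element-wise over the two columns. Note that np.vectorize is a convenience wrapper around a Python loop, not a performance optimization.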

import numpy as np
import pandas as pd
from geopy.geocoders import Nominatim
from geopy.point import Point

geolocator = Nominatim(user_agent="test")

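def reverse_geocoding(lat, lon):
    """Return the display name for one coordinate pair, or None on failure."""
    try:
        # reverse() takes a single query argument, so wrap lat/lon in a Point
        location = geolocator.reverse(Point(lat, lon))
        return location.raw['display_name']
    except Exception:
        # covers network/service errors and a None result (no match found)
        return None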

df = pd.DataFrame({'id': [1,2,3,4,5],
                   'Latitude': [30.197535, 34.895699, 33.636700, 33.636700, 32.733601],
                   'Longitude': [-97.662015, -82.218903, -84.428101, -84.428101, -117.190000]
                    })

df['address'] = np.vectorize(reverse_geocoding)(df['Latitude'], df['Longitude'])

print(df)

will result in:

   id   Latitude   Longitude                                            address
0   1  30.197535  -97.662015  Austin-Bergstrom International Airport, 3600, ...
1   2  34.895699  -82.218903  Greenville-Spartanburg International Airport, ...
2   3  33.636700  -84.428101  Hartsfield–Jackson Atlanta International Airpo...
3   4  33.636700  -84.428101  Hartsfield–Jackson Atlanta International Airpo...
4   5  32.733601 -117.190000  San Diego International Airport, North Harbor ...
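
Rows 3 and 4 in the sample data share the same coordinates, so the same address is requested twice. The following is a minimal sketch of one way to look up each distinct pair only once. It is not part of the original answer: add_addresses and fake_reverse are hypothetical names, and fake_reverse is an offline stand-in so the example runs without calling any geocoding service.

```python
import pandas as pd


def add_addresses(df, reverse, lat_col="Latitude", lon_col="Longitude"):
    """Return a copy of df with an 'address' column.

    `reverse` is any callable that takes one (lat, lon) tuple and returns
    an address string or None (hypothetical interface for this sketch).
    """
    cache = {}

    def lookup(pair):
        # Call the geocoder only the first time a pair is seen
        if pair not in cache:
            cache[pair] = reverse(pair)
        return cache[pair]

    out = df.copy()
    out["address"] = [lookup(pair) for pair in zip(out[lat_col], out[lon_col])]
    return out


def fake_reverse(pair):
    # Offline stand-in for a real reverse geocoder; counts its own calls
    fake_reverse.calls += 1
    return f"address near {pair[0]:.4f}, {pair[1]:.4f}"


fake_reverse.calls = 0

df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "Latitude": [30.197535, 34.895699, 33.636700, 33.636700, 32.733601],
    "Longitude": [-97.662015, -82.218903, -84.428101, -84.428101, -117.190000],
})

result = add_addresses(df, fake_reverse)
```

With GeoPy, reverse could be a small wrapper around geolocator.reverse that returns location.raw['display_name']. Wrapping geolocator.reverse in geopy.extra.rate_limiter.RateLimiter with min_delay_seconds=1 is one way to stay within Nominatim's usage policy of at most one request per second.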