I am using the code below to identify the US county for each business. The data is taken from Yelp, which provides lat/lon coordinates.
id | latitude | longitude |
---|---|---|
1 | 40.017544 | -105.283348 |
2 | 45.588906 | -122.593331 |
import pandas
df = pandas.read_json("/Users/yelp/yelp_academic_dataset_business.json", lines=True, encoding='utf-8')
# Identify county
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="http")
df['county'] = geolocator.reverse(df['latitude'],df['longitude'])
The error was "TypeError: reverse() takes 2 positional arguments but 3 were given".
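Nominatim.reverse takes a single query argument: a coordinate pair, given either as a (latitude, longitude) tuple or as a "latitude, longitude" string. The TypeError is raised because latitude and longitude are passed as two separate positional arguments. Beyond that, you are passing it pandas dataframe columns: df['latitude'] here refers to the entire column in your data, not just one value, and since geopy is independent of pandas, it doesn't support processing an entire column, so wrapping the two columns in a tuple would not work either.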
Instead, try looping through the rows:
county = []
for row in range(len(df)):
    county.append(geolocator.reverse((df['latitude'][row], df['longitude'][row])))
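(Note the double parentheses: the inner pair builds the (latitude, longitude) tuple that reverse expects as its single argument.)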
Then, insert the column into the dataframe:
df.insert(index, 'county', county, True)
(index should be the column position you want, and the boolean value at the end is allow_duplicates, which lets the insert go ahead even if a column named 'county' already exists.)
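One thing to be aware of: reverse returns a geopy Location object (or None if nothing is found), not a county name, so the loop above stores whole Location objects in the column. Here is a rough sketch of how the county itself could be pulled out. It assumes the Nominatim response carries an 'address' dictionary with a 'county' key, which is not guaranteed for every coordinate. The helper names (county_from_location, lookup_counties, make_nominatim_reverse) are made up for this example, and the one-second delay is an assumption based on Nominatim's public usage policy.

```python
from typing import Any, Callable, Iterable, List, Optional


def county_from_location(location: Any) -> Optional[str]:
    """Return the 'county' entry of a geopy Location, or None if it is absent."""
    if location is None:  # reverse() returns None when nothing is found
        return None
    # Assumption: the raw Nominatim response has an 'address' dict with a 'county' key.
    return location.raw.get("address", {}).get("county")


def lookup_counties(
    latitudes: Iterable[float],
    longitudes: Iterable[float],
    reverse: Callable[[Any], Any],
) -> List[Optional[str]]:
    """Reverse-geocode each (lat, lon) pair, one call per row, and keep only the county."""
    return [
        county_from_location(reverse((lat, lon)))  # one tuple per call, as in the loop above
        for lat, lon in zip(latitudes, longitudes)
    ]


def make_nominatim_reverse(user_agent: str, min_delay_seconds: float = 1.0):
    """Build a rate-limited Nominatim reverse function (hypothetical helper)."""
    # Imported here so the helpers above can be used without geopy installed.
    from geopy.extra.rate_limiter import RateLimiter
    from geopy.geocoders import Nominatim

    geolocator = Nominatim(user_agent=user_agent)
    # Space out requests; the delay value is an assumption, adjust as needed.
    return RateLimiter(geolocator.reverse, min_delay_seconds=min_delay_seconds)
```

With the dataframe from the question, this could be used as df['county'] = lookup_counties(df['latitude'], df['longitude'], make_nominatim_reverse('my-county-lookup')), where the user agent string is a placeholder. Since it makes one request per row, expect it to be slow on a large dataset.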