I'm running the code below and getting non-US values in 'dropoff_location', 'dropoff_lat', and 'dropoff_lon' for some US zip codes. All of the zip codes are in the New York City area, so every 'dropoff_location', 'dropoff_lat', and 'dropoff_lon' should be in the New York City area too. Am I doing something wrong here?
import geopandas
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="ryan_app")
#applying the rate limiter wrapper
from geopy.extra.rate_limiter import RateLimiter
geocode = RateLimiter(geolocator.geocode)
#Applying the method to pandas DataFrame
df['dropoff_location'] = df['dropoff_zip'].apply(geocode)
df['dropoff_lat'] = df['dropoff_location'].apply(lambda x: x.latitude if x else None)
df['dropoff_lon'] = df['dropoff_location'].apply(lambda x: x.longitude if x else None)
df.head()
Result:
pickup_datetime dropoff_datetime trip_distance fare_amount pickup_zip dropoff_zip time_of_trip dropoff_location dropoff_lat dropoff_lon
95 2016-02-02 14:00:28 2016-02-02 14:20:22 2.04 13.5 10001 10199 0 days 00:19:54 (Manhattan, New York County, City of New York,... 40.751528 -73.995849
96 2016-02-10 00:25:33 2016-02-10 00:30:09 1.03 5.5 10001 10011 0 days 00:04:36 (Manhattan, New York County, City of New York,... 40.740972 -73.999560
97 2016-02-19 09:19:18 2016-02-19 09:34:41 2.10 11.5 10002 10001 0 days 00:15:23 (Корольовський район, Житомир, Житомирська міс... 50.269960 28.702845
98 2016-02-12 21:14:59 2016-02-12 21:22:33 0.93 6.5 10011 10012 0 days 00:07:34 (Bechloul, Daïra Bechloul, Bouira, 10012, Algé... 36.312195 4.074957
99 2016-02-04 21:25:09 2016-02-04 21:35:38 1.70 9.0 10028 10065 0 days 00:10:29 (San Germano Chisone, Torino, Piemonte, 10065,... 44.894901 7.235602
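A bare number like 10001 is a valid postal code in several countries, so the geocoder can return a match outside the US. You just need to limit the search results to a specific country (or a list of countries) by passing the country_codes argument to the geolocator.geocode method. Your code would look like this: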
import geopandas
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="ryan_app")
df['dropoff_location'] = df['dropoff_zip'].apply(geolocator.geocode, country_codes="US", timeout=1)
df['dropoff_lat'] = df['dropoff_location'].apply(lambda x: x.latitude if x else None)
df['dropoff_lon'] = df['dropoff_location'].apply(lambda x: x.longitude if x else None)
print(df)
Output:
pickup_zip dropoff_zip dropoff_location dropoff_lat dropoff_lon
0 10001 10199 (Manhattan, New York County, City of New York,... 40.751528 -73.995849
1 10001 10011 (Manhattan, New York County, City of New York,... 40.740858 -73.999422
2 10002 10001 (Manhattan, New York County, City of New York,... 40.748399 -73.994036
3 10011 10012 (Manhattan, New York County, City of New York,... 40.725028 -73.998068
4 10028 10065 (Manhattan, New York County, City of New York,... 40.766035 -73.964690
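Note that the snippet above calls geolocator.geocode directly, so it no longer goes through the RateLimiter from the question. As a rough sketch (one way to do it, not the only one), the two can be combined with functools.partial, and each distinct zip code can be looked up only once. The helper names make_us_geocoder and geocode_zip_codes are made up for illustration, and the one-second delay is an assumption based on the public Nominatim server's usage policy:

```python
from functools import partial


def make_us_geocoder(user_agent="ryan_app", min_delay_seconds=1.0):
    """Return a rate-limited geocode callable restricted to US results."""
    # Imported lazily so geocode_zip_codes() can be used without geopy.
    from geopy.geocoders import Nominatim
    from geopy.extra.rate_limiter import RateLimiter

    geolocator = Nominatim(user_agent=user_agent)
    # Assumed delay: at most one request per second.
    limited = RateLimiter(geolocator.geocode, min_delay_seconds=min_delay_seconds)
    # partial() pins country_codes so the callable takes only the query.
    return partial(limited, country_codes="US")


def geocode_zip_codes(zip_codes, geocode):
    """Geocode each distinct zip code once.

    Returns a dict mapping the zip code (as a string) to a (lat, lon)
    tuple, or to (None, None) when the geocoder finds nothing.
    """
    coords = {}
    for zip_code in zip_codes:
        key = str(zip_code)
        if key in coords:
            continue  # already looked up, skip the extra request
        location = geocode(key)
        if location:
            coords[key] = (location.latitude, location.longitude)
        else:
            coords[key] = (None, None)
    return coords
```

With a DataFrame, something like coords = geocode_zip_codes(df['dropoff_zip'], make_us_geocoder()) gives a dict that can be mapped back onto the column, e.g. df['dropoff_zip'].astype(str).map(coords).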
You can also get a more detailed address by reverse geocoding the latitude and longitude you extracted from the zip codes, like this:
import numpy as np
import geopy
geolocator = geopy.geocoders.Nominatim(user_agent="ryan_app")
def reverse_geocoding(lat, lon):
    # Look up the address for a coordinate pair; return None on any failure
    try:
        location = geolocator.reverse(geopy.point.Point(lat, lon))
        return location.raw['display_name']
    except Exception:
        return None
df['dropoff_location'] = df['dropoff_zip'].apply(geolocator.geocode, country_codes="US", timeout=1)
df['dropoff_lat'] = df['dropoff_location'].apply(lambda x: x.latitude if x else None)
df['dropoff_lon'] = df['dropoff_location'].apply(lambda x: x.longitude if x else None)
df['detailed_dropoff_address'] = np.vectorize(reverse_geocoding)(df['dropoff_lat'], df['dropoff_lon'])
print(df.head())
Output:
pickup_zip dropoff_zip dropoff_location dropoff_lat dropoff_lon detailed_dropoff_address
0 10001 10199 (Manhattan, New York County, City of New York,... 40.751528 -73.995849 Moynihan Train Hall, West 31st Street, Chelsea...
1 10001 10011 (Manhattan, New York County, City of New York,... 40.740858 -73.999422 224, West 17th Street, Chelsea District, Manha...
2 10002 10001 (Manhattan, New York County, City of New York,... 40.748399 -73.994036 227, West 29th Street, Chelsea, Manhattan, New...
3 10011 10012 (Manhattan, New York County, City of New York,... 40.725028 -73.998068 Self-Portrait, 158, Mercer Street, Manhattan C...
4 10028 10065 (Manhattan, New York County, City of New York,... 40.766035 -73.964690 Church of St. Vincent Ferrer, East 66th Street...
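If some zip codes could not be geocoded, their dropoff_lat and dropoff_lon will be missing, and the try/except above hides that along with any real error such as a timeout. A small variation, sketched under the assumption that missing coordinates show up as None or NaN, skips those rows before making a request and lets genuine errors surface. reverse_lookup is a hypothetical name, and reverse stands for any callable shaped like geolocator.reverse:

```python
import math


def reverse_lookup(lat, lon, reverse):
    """Return the display name for (lat, lon), or None if unavailable."""
    # Skip rows where forward geocoding produced no coordinates.
    if lat is None or lon is None or math.isnan(lat) or math.isnan(lon):
        return None
    location = reverse((lat, lon))  # geopy accepts a (lat, lon) tuple
    if location is None:
        return None
    return location.raw.get("display_name")
```

It could be applied row by row, for example with [reverse_lookup(lat, lon, geolocator.reverse) for lat, lon in zip(df['dropoff_lat'], df['dropoff_lon'])].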