I'm looking for a Python package that can help me get the country from an address.
I use pycountry, but it only works when the country itself appears in the address. I don't know what to do with inputs such as:
"Georgetown, TX" , "Santa Fe, New Mexico", "Nuremberg", "Haarbergstr. 67 D-99097 Erfurt".
In these cases there is no country in the address and no clear pattern to rely on.
It seems geopy can do this relatively easily. An example adapted from the documentation:
>>> import geopy
>>> from geopy.geocoders import Nominatim
>>> gl = Nominatim(user_agent="country-lookup")  # placeholder name; newer geopy versions require a user_agent
>>> l = gl.geocode("Georgetown, TX")
# now we have l = Location((30.671598, -97.6550065012, 0.0))
>>> l.address
u'Georgetown, Williamson County, Texas, United States of America'
# split that address on commas into a list, and get the last item (i.e. the country)
>>> l.address.split(',')[-1]
u' United States of America'
We got it! Now test it on the other locations:
>>> l = gl.geocode("Santa Fe, New Mexico")
>>> l.address.split(',')[-1]
u' United States of America'
>>> l = gl.geocode("Nuremberg")
>>> l.address.split(',')[-1]
u' Deutschland'
>>> l = gl.geocode("Haarbergstr. 67 D-99097 Erfurt")
>>> l.address.split(',')[-1]
u' Europe'
So you could automate the list in a script:
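from geopy.geocoders import Nominatim

# Placeholder user_agent; newer geopy versions require one for Nominatim.
geolocator = Nominatim(user_agent="country-lookup")
list_of_locations = ["Georgetown, TX", "Santa Fe, New Mexico", "Nuremberg", "Haarbergstr. 67 D-99097 Erfurt"]
for loc in list_of_locations:
    location = geolocator.geocode(loc)
    if location is None:  # geocode() returns None when nothing is found
        print('{loc}: not found'.format(loc=loc))
        continue
    # The country is the last comma-separated part of the address.
    country = location.address.split(',')[-1].strip()
    print('{loc}: {country}'.format(loc=loc, country=country))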
Output:
Georgetown, TX: United States of America
Santa Fe, New Mexico: United States of America
Nuremberg: Deutschland
Haarbergstr. 67 D-99097 Erfurt: Europe
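Note the last result: taking the final comma-separated piece of the address gave "Europe" rather than a country, so that heuristic can be fragile. A possibly more robust variant is to ask Nominatim for a structured address and read the country from it. The following is only a minimal sketch under some assumptions: the helper names `country_from_raw` and `lookup_country` and the `user_agent` string are made up for illustration, and the sample dictionary is hand-written to resemble a result, not real output.

```python
def country_from_raw(raw):
    """Return (country_name, ISO code) from a Nominatim raw result, or (None, None)."""
    address = (raw or {}).get("address", {})
    country = address.get("country")
    code = address.get("country_code")
    return (country, code.upper() if code else None)


def lookup_country(query, user_agent="country-lookup-example"):
    # Imported here so country_from_raw can be used without geopy installed.
    from geopy.geocoders import Nominatim

    geolocator = Nominatim(user_agent=user_agent)
    # addressdetails=True asks for a structured address breakdown;
    # language="en" requests English names where available.
    location = geolocator.geocode(query, addressdetails=True, language="en")
    if location is None:  # nothing matched the query
        return (None, None)
    return country_from_raw(location.raw)


# Hand-written sample shaped like a Nominatim result (illustrative only).
sample = {"address": {"city": "Erfurt", "country": "Germany", "country_code": "de"}}
print(country_from_raw(sample))  # ('Germany', 'DE')
```

Calling `lookup_country("Haarbergstr. 67 D-99097 Erfurt")` would need network access, so no output is shown for it here. Since you already use pycountry, the two-letter code could then be passed to something like `pycountry.countries.get(alpha_2=code)` to get a consistent country name.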
Hope this helps.