I'm retroactively sanitizing a bunch of data for the Offer Drive product (http://offerletter.io/drive.html). I'm trying to normalize a freeform "location" field to determine if submitted locations fall in the United States (or not).
Values may vary in granularity, but all are "real", e.g.
San Francisco, CA
Milwaukee
Bangalore
Is there a good way (an API or library) to normalize these intelligently from the user-submitted strings, so that I can write something like:
normalized = GeoNormalize.normalize("San Francisco")
return normalized.country() == "United States"
I really like Chronyk ( https://github.com/KoffeinFlummi/Chronyk ), and something like that for locations would be great.
There are many geocoding services that do this, usually provided by mapping or GIS vendors.
For example, the Google geocoding service accepts a string and returns a ranked set of locations in a standard format:
https://developers.google.com/maps/documentation/geocoding/?csw=1#Geocoding
Yahoo has one too:
https://developer.yahoo.com/boss/geo/#overview
Like I said, there are many. They are usually free for light usage but charge fees once you pass a certain request volume.
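As a rough sketch of how the Google geocoding service mentioned above could answer the "is it in the United States?" question: the JSON endpoint takes an `address` and a `key` parameter, and each result carries an `address_components` list in which the country entry has `"country"` among its `types`. The helper names (`country_of`, `geocode`, `is_in_us`) are made up for this example, you need your own API key, and taking only the top-ranked result is an assumption that can misfire on ambiguous names.

```python
import json
import urllib.parse
import urllib.request

# JSON endpoint of the Google Geocoding API.
GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def country_of(response):
    """Return the country short name (e.g. "US") of the top-ranked result.

    `response` is the decoded JSON body. Returns None when there are no
    results or the top result has no country component.
    """
    results = response.get("results", [])
    if not results:
        return None
    # Assumption: trust the first (highest-ranked) result only.
    for component in results[0].get("address_components", []):
        if "country" in component.get("types", []):
            return component.get("short_name")
    return None


def geocode(location, api_key):
    """Send one freeform location string to the service and decode the reply."""
    query = urllib.parse.urlencode({"address": location, "key": api_key})
    with urllib.request.urlopen(f"{GEOCODE_URL}?{query}", timeout=10) as resp:
        return json.load(resp)


def is_in_us(location, api_key):
    """Hypothetical convenience wrapper: True if the best match is in the US."""
    return country_of(geocode(location, api_key)) == "US"
```

Calling `is_in_us("San Francisco, CA", api_key)` makes one billable request per string, so for a retroactive cleanup it is probably worth deduplicating the location values and caching the answers first.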