I want to figure out which state a latitude/longitude pair belongs to. For this I am using the shapefiles provided by the US Census and the shapely library. This is what I have tried so far:
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
df_poly = gpd.read_file("data/tl_2019_us_state.shp")
df_poly = df_poly[['GEOID', 'geometry']].set_index('GEOID')
display(df_poly.head(5))
                                                geometry
GEOID
54     POLYGON ((-81.74725 39.09538, -81.74635 39.096...
12     MULTIPOLYGON (((-86.38865 30.99418, -86.38385 ...
17     POLYGON ((-91.18529 40.63780, -91.17510 40.643...
27     POLYGON ((-96.78438 46.63050, -96.78434 46.630...
24     POLYGON ((-77.45881 39.22027, -77.45866 39.220...
p1 = Point(map(float, (29.65, -95.17)))
any(df_poly['geometry'].contains(p1))
False
But it returns False for any coordinate I try. For example, the coordinate above is in Texas, yet it still returns False. What am I missing here?
Here are a few things you should check:
Did you use the correct order for the point? Shapely points use (x, y) coordinates, i.e. (lon, lat), which is the opposite of the usual (lat, lon) order. I'd try flipping the coordinates and seeing if that works.
For example, I see one of the coordinates in your polygon output is this: "-81.74725 39.09538". If you interpret that in (lat, lon) order, it's in Antarctica. If you interpret it in (x, y) order, it's around the Ohio/West Virginia border, which is plausible for a US state boundary.
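To make the ordering issue concrete, here is a minimal sketch. It uses a rough rectangle as a stand-in for Texas; the bounds are an illustrative assumption, not the real Census geometry. The same point is built in both orders to show which one the polygon contains.

```python
from shapely.geometry import Point, box

# Hypothetical stand-in for the Texas polygon: a rough bounding box given as
# (min_lon, min_lat, max_lon, max_lat). This is NOT the real Census shape.
texas_like = box(-106.7, 25.8, -93.5, 36.5)

lat, lon = 29.65, -95.17

wrong = Point(lat, lon)  # x=29.65, y=-95.17: latitude ends up in the x slot
right = Point(lon, lat)  # x=longitude, y=latitude, as Shapely expects

print(texas_like.contains(wrong))
print(texas_like.contains(right))
```

With the real shapefile, the equivalent change would be to build the point as Point(-95.17, 29.65) before calling contains.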
Are you using the correct CRS (SRID)? The census data usually uses NAD83 (EPSG:4269), but this is a good thing to check:
print(df_poly.crs)
Another good sanity check is to look at the centroid of each polygon, and verify that it's reasonable:
df_poly.geometry.centroid
In the past, I've seen people whose data was in the wrong SRID and had to be converted (GeoPandas can do this with to_crs).
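Putting the checks together, a lookup might be sketched as follows. The two rectangles are hypothetical stand-ins for the state polygons (rough boxes tagged with GEOIDs 48 and 12), and the point is assumed to be WGS84 (EPSG:4326). With the real data you would keep the df_poly loaded from the shapefile instead of building one by hand.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical stand-in for the Census state layer: one rough bounding box per
# state, tagged with NAD83 (EPSG:4269). These are NOT the real geometries.
states = gpd.GeoDataFrame(
    {"GEOID": ["48", "12"]},
    geometry=[box(-106.7, 25.8, -93.5, 36.5), box(-87.6, 24.5, -80.0, 31.0)],
    crs="EPSG:4269",
).set_index("GEOID")

# The point, built in (lon, lat) order and assumed to be in WGS84.
points = gpd.GeoSeries([Point(-95.17, 29.65)], crs="EPSG:4326")

# Reproject the point into the polygons' CRS before testing containment.
p = points.to_crs(states.crs).iloc[0]

# Keep the GEOIDs of every polygon that contains the point.
matches = states[states.contains(p)].index.tolist()
print(matches)
```

Returning the matching GEOIDs instead of wrapping the result in any(...) tells you which state matched, not just whether one did.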