I am using the python module arcgis
to download a shapefile, located here: https://www.arcgis.com/home/item.html?id=2d5c785555aa4b0b946f1aa61c56274f
I have managed to extract it into a pandas dataframe by following the documentation:
from arcgis import GIS
import pandas as pd
gis = GIS(verify_cert=False,api_key=your_key)
# search for file by name which is National_LHO
search_result = gis.content.search(query="title:National_LHO", item_type="Feature Layer")
# get layer
layer = search_result[0].layers[0]
# dataframe from layer
df= pd.DataFrame.spatial.from_layer(layer)
# check it out
print(df.head())
FID LHO Shape__Area Shape__Length \
0 1 Carlow/Kilkenny 7.053876e+09 5.924032e+05
1 2 Cavan/Monaghan 8.858580e+09 7.801971e+05
2 3 Clare 9.055446e+09 8.005301e+05
3 4 Donegal 1.467971e+10 2.135710e+06
4 5 Dublin North 1.076876e+09 3.327819e+05
SHAPE
0 {"rings": [[[-747212.35980769, 6967909.5066712...
1 {"rings": [[[-781459.713316924, 7249668.124932...
2 {"rings": [[[-1083308.07544972, 6918940.329570...
3 {"rings": [[[-912809.697847961, 7265617.367554...
4 {"rings": [[[-674539.041086896, 7057323.867987...
I have then converted it to a geopandas dataframe:
import geopandas as gpd
gdf = gpd.GeoDataFrame(dc_df)
However the CRS info looks strange. It says the bounds are (-180.0, -90.0, 180.0, 90.0)
but the geometry coordinates go much higher than that.
print(gdf.crs)
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
print(gdf.head())
FID LHO Shape__Area Shape__Length \
0 1 Carlow/Kilkenny 7.053876e+09 5.924032e+05
1 2 Cavan/Monaghan 8.858580e+09 7.801971e+05
2 3 Clare 9.055446e+09 8.005301e+05
3 4 Donegal 1.467971e+10 2.135710e+06
4 5 Dublin North 1.076876e+09 3.327819e+05
geometry
0 MULTIPOLYGON (((-747212.360 6967909.507, -7459...
1 MULTIPOLYGON (((-781459.713 7249668.125, -7813...
2 MULTIPOLYGON (((-1083308.075 6918940.330, -108...
3 MULTIPOLYGON (((-912809.698 7265617.368, -9127...
4 MULTIPOLYGON (((-674539.041 7057323.868, -6748...
I have been told that simply using to_crs('EPSG:4326')
will convert everything to latitude and longitude, but that doesn't work.
gdf.to_crs('EPSG:4326').head()
As you can see they are still in xy coordinates, and not long / lat.
I cannot figure this out. I just want longitude and latitude polygons.
Does any body know know to do this? I am losing my mind and have spent days on this. I am not a geopandas expert.
Perhaps some useful info. The data I downloaded has this spatial reference: Spatial Reference: 102100 (3857)
I tried to use 3857
while converting to a geodataframe, but got an error.
gdf = gpd.GeoDataFrame(dc_df,crs='3857')
FutureWarning: CRS mismatch between CRS of the passed geometries and 'crs'. Use 'GeoDataFrame.set_crs(crs, allow_override=True)' to overwrite CRS or 'GeoDataFrame.to_crs(crs)' to reproject geometries. CRS mismatch will raise an error in the future versions of GeoPandas.
It wasn’t converting because when you read the normal pandas dataframe into geopandas dataframe, the crs information was lost
from arcgis import GIS
import pandas as pd
gis = GIS(verify_cert=False)
# search for file by name which is National_LHO
search_result = gis.content.search(query="title:National_LHO", item_type="Feature Layer")
# get layer
layer = search_result[0].layers[0]
# dataframe from layer
df= pd.DataFrame.spatial.from_layer(layer)
# check it out
# print(df.head())
import geopandas as gpd
gdf = gpd.GeoDataFrame(df)
If you do gdf.crs
it returns empty.
So the crs information has to be set manually. Setting it to epsg 3857 as you have mentioned it
gdf = gdf.set_geometry('SHAPE')
gdf = gdf.set_crs(epsg='3857')
and now you can do
gdf.to_crs(epsg='4326').head()