Search code examples
pythongeopandas

How to convert MULTIPOLYGON xy geometry to latitude and longitude?


I am using the python module arcgis to download a shapefile, located here: https://www.arcgis.com/home/item.html?id=2d5c785555aa4b0b946f1aa61c56274f

I have managed to extract it into a pandas dataframe by following the documentation:

enter image description here

from arcgis import GIS
import pandas as pd
gis = GIS(verify_cert=False,api_key=your_key)

# search for file by name which is National_LHO
search_result = gis.content.search(query="title:National_LHO", item_type="Feature Layer")

# get layer
layer = search_result[0].layers[0]

# dataframe from layer
df= pd.DataFrame.spatial.from_layer(layer)

# check it out
print(df.head())


   FID              LHO   Shape__Area  Shape__Length  \
0    1  Carlow/Kilkenny  7.053876e+09   5.924032e+05   
1    2   Cavan/Monaghan  8.858580e+09   7.801971e+05   
2    3            Clare  9.055446e+09   8.005301e+05   
3    4          Donegal  1.467971e+10   2.135710e+06   
4    5     Dublin North  1.076876e+09   3.327819e+05   

                                               SHAPE  
0  {"rings": [[[-747212.35980769, 6967909.5066712...  
1  {"rings": [[[-781459.713316924, 7249668.124932...  
2  {"rings": [[[-1083308.07544972, 6918940.329570...  
3  {"rings": [[[-912809.697847961, 7265617.367554...  
4  {"rings": [[[-674539.041086896, 7057323.867987...  

I have then converted it to a geopandas dataframe:

import geopandas as gpd

gdf = gpd.GeoDataFrame(dc_df)


However the CRS info looks strange. It says the bounds are (-180.0, -90.0, 180.0, 90.0) but the geometry coordinates go much higher than that.

print(gdf.crs)

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
print(gdf.head())

   FID              LHO   Shape__Area  Shape__Length  \
0    1  Carlow/Kilkenny  7.053876e+09   5.924032e+05   
1    2   Cavan/Monaghan  8.858580e+09   7.801971e+05   
2    3            Clare  9.055446e+09   8.005301e+05   
3    4          Donegal  1.467971e+10   2.135710e+06   
4    5     Dublin North  1.076876e+09   3.327819e+05   

                                            geometry  
0  MULTIPOLYGON (((-747212.360 6967909.507, -7459...  
1  MULTIPOLYGON (((-781459.713 7249668.125, -7813...  
2  MULTIPOLYGON (((-1083308.075 6918940.330, -108...  
3  MULTIPOLYGON (((-912809.698 7265617.368, -9127...  
4  MULTIPOLYGON (((-674539.041 7057323.868, -6748...  

I have been told that simply using to_crs('EPSG:4326') will convert everything to latitude and longitude, but that doesn't work.

gdf.to_crs('EPSG:4326').head()

enter image description here

As you can see they are still in xy coordinates, and not long / lat.

I cannot figure this out. I just want longitude and latitude polygons.

Does any body know know to do this? I am losing my mind and have spent days on this. I am not a geopandas expert.

Perhaps some useful info. The data I downloaded has this spatial reference: Spatial Reference: 102100 (3857)

I tried to use 3857 while converting to a geodataframe, but got an error.

gdf = gpd.GeoDataFrame(dc_df,crs='3857')

FutureWarning: CRS mismatch between CRS of the passed geometries and 'crs'. Use 'GeoDataFrame.set_crs(crs, allow_override=True)' to overwrite CRS or 'GeoDataFrame.to_crs(crs)' to reproject geometries. CRS mismatch will raise an error in the future versions of GeoPandas.

Solution

  • It wasn’t converting because when you read the normal pandas dataframe into geopandas dataframe, the crs information was lost

    from arcgis import GIS
    import pandas as pd
    gis = GIS(verify_cert=False)
    
    # search for file by name which is National_LHO
    search_result = gis.content.search(query="title:National_LHO", item_type="Feature Layer")
    
    # get layer
    layer = search_result[0].layers[0]
    
    # dataframe from layer
    df= pd.DataFrame.spatial.from_layer(layer)
    
    # check it out
    # print(df.head())
    
    import geopandas as gpd
    
    gdf = gpd.GeoDataFrame(df)
    

    If you do gdf.crs it returns empty.

    So the crs information has to be set manually. Setting it to epsg 3857 as you have mentioned it

    gdf = gdf.set_geometry('SHAPE')
    gdf = gdf.set_crs(epsg='3857')
    

    and now you can do

    gdf.to_crs(epsg='4326').head()
    

    enter image description here