Search code examples
pythongeopandasarea

How can I convert geopandas crs units to meters^2?


I'm using geopandas, and I'm trying to get the areas of the shapefile I uploaded, but I'm unsure of what the unit of measurement is for the result of the .area attribute. Here is my code:

water = gp.read_file("/Users/user/Downloads/tlgdb_2020_a_us_areawater.gdb")
lakeSup = water.loc[water['FULLNAME'].str.contains('Lk Superior', na=False)]
lakeSup = lakeSup.to_crs(epsg=4269)
lakeSup['area'] = lakeSup.area
print(lakeSup.head(10))

Which outputs

       ANSICODE  ...      area
233938     None  ...  0.000821
629973     None  ...  0.215539
629974     None  ...  0.043184
629975     None  ...  0.358674
629976     None  ...  0.533665
629977     None  ...  0.035854
629978     None  ...  0.054233
629979     None  ...  0.737469
629980     None  ...  0.101494
629981     None  ...  0.035499

According to geopandas' docs, the .area attribute "Returns a Series containing the area of each geometry in the GeoSeries expressed in the units of the CRS." I'm very confused as to what this means, and how I can use this information to simply get a defined area for the geometries of my shapefile (e.g. meters^2). Thank you!


Solution

  • The most important part of area computation by geopandas is that you must use the CRS that corresponds to an equal-area projection. In this case, the simple cylindrical equal-area projection (epsg=6933) is good to use.

    The data files can be obtained from:- data_source

    The relevant code should becomes:

    water = gp.read_file("/Users/user/Downloads/tlgdb_2020_a_us_areawater.gdb")
    
    # check the CRS
    water.crs
    # you should find that it is already `epsg=4269`
    
    # select some rows of data
    lakeSup = water.loc[water['FULLNAME'].str.contains('Lk Superior', na=False)]
    
    # lakeSup = lakeSup.to_crs(epsg=4269) # already is
    
    # convert CRS to equal-area projection
    # the length unit is now `meter`
    eqArea_lakeSup = lakeSup.to_crs(epsg=6933)
    
    # compute areas in sq meters
    areas = eqArea_lakeSup.area
    
    # set the area units to sq Km.
    # and add it as a new column to geodataframe
    eqArea_lakeSup["area_sqKm"] = areas.values/10e6
    
    # print some result
    print(eqArea_lakeSup[["FULLNAME","area_sqKm"]].head(10))
    

    Output:

               FULLNAME   area_sqKm
    233938  Lk Superior    0.695790
    629973  Lk Superior  181.784389
    629974  Lk Superior   36.563379
    629975  Lk Superior  303.028792
    629976  Lk Superior  446.529784
    629977  Lk Superior   30.418959
    629978  Lk Superior   45.785750
    629979  Lk Superior  621.004141
    629980  Lk Superior   86.142490
    629981  Lk Superior   30.186412