I am trying to save a shapefile locally with GeoPandas, preferably as a zipped file, although I have tried both compressed and uncompressed output. After saving the file and reading it back in, three column names have changed: most importantly, 'geom' has reverted to 'geometry'; in addition, 'parcel_apn_2' is now 'parcel_a_1' and 'fips_county' is now 'fips_count'. Am I missing something that would cause this behavior?
Checking the column names prior to saving:
# shp_prior_to_writing is the original GeoDataFrame
shp_prior_to_writing.columns
returns...
Index(['xref_id', 'fips_state', 'fips_county', 'county', 'parcel_apn',
'parcel_apn_2', 'address', 'city', 'state', 'zip', 'src_id', 'latitude',
'longitude', 'geom'],
dtype='object')
then writing the same file locally...
shp_prior_to_writing.to_file('test_shp.shp', driver='ESRI Shapefile')
and reading it back in...
import geopandas as gpd
same_shape_file = gpd.read_file('test_shp.shp')
same_shape_file.columns
returns...
Index(['xref_id', 'fips_state', 'fips_count', 'county', 'parcel_apn',
'parcel_a_1', 'address', 'city', 'state', 'zip', 'src_id', 'latitude',
'longitude', 'geometry'],
dtype='object')
I've tried both zipped and uncompressed output, writing without explicitly setting a driver (I believe it defaults to ESRI Shapefile anyway), and restarting the Jupyter kernel in my notebook. I've also tried explicitly renaming those columns again just before saving, but the result is always the same.
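The shapefile format has a hard limit of 10 characters on column (field) names. That limit is baked into the format specification, which comes from ESRI: attributes are stored in a dBASE (.dbf) table, and that table restricts field names to 10 characters. It is not the fault of geopandas, or of Fiona, through which geopandas reaches the underlying shapefile driver.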
See e.g. a discussion of the ESRI Shapefile standard’s limitations on Wikipedia, which lists the 10-character limit. Also see GIS StackExchange: Bypassing 10 character limit of field name in shapefiles? for a discussion of options.
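The renamed columns in the question are consistent with simple truncation to 10 characters plus a numeric suffix when two names collide. The sketch below reproduces that pattern; the function `shorten_for_shapefile` is a hypothetical helper written for illustration, and its rules are inferred from the output shown in the question rather than taken from the driver's actual implementation.

```python
def shorten_for_shapefile(names, limit=10):
    """Approximate how field names change when written to a shapefile.

    Assumption: each name is cut to `limit` characters; if the result
    collides with a name already used, the name is cut to `limit - 2`
    characters and given a suffix (_1, _2, ...). This mirrors the output
    in the question and only covers up to nine collisions per name.
    """
    used = set()
    result = []
    for name in names:
        short = name[:limit]
        counter = 1
        while short in used:
            short = f"{name[:limit - 2]}_{counter}"
            counter += 1
        used.add(short)
        result.append(short)
    return result


# Attribute columns from the question (geometry column left out).
columns = ['xref_id', 'fips_state', 'fips_county', 'county', 'parcel_apn',
           'parcel_apn_2', 'address', 'city', 'state', 'zip', 'src_id',
           'latitude', 'longitude']

for old, new in zip(columns, shorten_for_shapefile(columns)):
    if old != new:
        print(f"{old} -> {new}")
```

Running a check like this before writing shows which names will not survive the round trip, so they can be shortened deliberately instead of being renamed on write.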
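Because of this 10-character limit, your columns must be renamed before they can be written, which is the name change you are seeing. The 'geom' to 'geometry' change has a different cause: a shapefile does not store a name for its geometry column, so geopandas falls back to its default name, 'geometry', when reading the file. If you want to keep these column names and have them round-trip to disk, you will need to use a different file format.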
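As one possible way to act on this, the sketch below picks the output format based on whether the column names fit within the shapefile limit, and falls back to GeoPackage (the `GPKG` driver) otherwise. The helpers `names_survive_shapefile`, `pick_driver` and `save_layer` are hypothetical names introduced here, and GeoPackage is only one of several formats without the 10-character restriction.

```python
SHAPEFILE_FIELD_LIMIT = 10  # maximum field-name length in a shapefile's .dbf table


def names_survive_shapefile(columns, limit=SHAPEFILE_FIELD_LIMIT):
    """Return True if every column name fits within the limit unchanged."""
    return all(len(str(name)) <= limit for name in columns)


def pick_driver(columns):
    """Choose an output driver and file extension (hypothetical helper)."""
    if names_survive_shapefile(columns):
        return "ESRI Shapefile", ".shp"
    return "GPKG", ".gpkg"


def save_layer(gdf, stem):
    """Write `gdf` (assumed to be a GeoDataFrame) to `stem` plus the chosen extension."""
    driver, extension = pick_driver(gdf.columns)
    path = f"{stem}{extension}"
    gdf.to_file(path, driver=driver)
    return path
```

With the GeoDataFrame from the question, `save_layer(shp_prior_to_writing, 'test_shp')` would write `test_shp.gpkg`, which can be read back with `gpd.read_file('test_shp.gpkg')`. The attribute names should then come back intact, but the geometry column may still be returned as 'geometry'; if so, `same_shape_file.rename_geometry('geom')` restores the original name.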