Tags: python, zip, shapefile, geopandas

Exporting a Geopandas dataframe to a zipped shapefile directly


I'm trying to save a GeoPandas GeoDataFrame as a shapefile that is written directly into a zip archive.

As any shapefile user knows, a shapefile is not a single file but rather a collection of files that are meant to be read together. So calling myGDF.to_file(filename='myshapefile.shp', driver='ESRI Shapefile') creates not only myshapefile.shp but also myshapefile.prj, myshapefile.dbf, myshapefile.shx and myshapefile.cpg. This is probably why I am struggling to get the syntax right here.

Consider, for instance, a dummy GeoDataFrame like:

import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

data = pd.DataFrame({'name': ['a', 'b', 'c'],
                     'property': ['foo', 'bar', 'foo'],
                     'x': [173994.1578792833, 173974.1578792833, 173910.1578792833],
                     'y': [444135.6032947102, 444186.6032947102, 444111.6032947102]})
geometry = [Point(xy) for xy in zip(data['x'], data['y'])]
myGDF = gpd.GeoDataFrame(data, geometry=geometry)

I saw people using gzip, so I tried:

import geopandas as gpd
myGDF.to_file(filename='myshapefile.shp.gz', driver='ESRI Shapefile',compression='gzip')

But it did not work.

Then I tried the following (in a Google Colab environment):

import zipfile
pathname = '/content/'
filename = 'myshapefile.shp'
zip_file = 'myshapefile.zip'
with zipfile.ZipFile(zip_file, 'w') as zipf:
   zipf.write(myGDF.to_file(filename = '/content/myshapefile.shp', driver='ESRI Shapefile'))

But this only puts the .shp file in the zip archive, while the other files are written next to the archive.

How can I write a GeoPandas GeoDataFrame directly to a zipped shapefile?


Solution

  • Simply use .shp.zip as the file extension, keeping the same driver name:

    myGDF.to_file(filename='myshapefile.shp.zip', driver='ESRI Shapefile')
    

    This should work with GDAL 3.1 or newer.
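
If the installed GDAL is older than 3.1, one possible fallback is to write the shapefile into a temporary directory and then zip every sidecar file (.shp, .shx, .dbf, .prj, .cpg) yourself. The following is a minimal sketch, not part of the original answer: the helper names zip_shapefile_parts and write_zipped_shapefile are made up for illustration, and the code assumes that gdf.to_file writes all parts next to the .shp under the same base name.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_shapefile_parts(src_dir, stem, zip_path):
    """Bundle every file named <stem>.* found in src_dir into one zip archive.

    Hypothetical helper: returns the sorted list of file names that were added.
    """
    parts = sorted(Path(src_dir).glob(f"{stem}.*"))
    if not parts:
        raise FileNotFoundError(f"no files named {stem}.* in {src_dir}")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for part in parts:
            # arcname keeps the files at the top level of the archive
            zf.write(part, arcname=part.name)
    return [part.name for part in parts]


def write_zipped_shapefile(gdf, zip_path, stem="myshapefile"):
    """Write gdf as a shapefile in a temporary directory, then zip all its parts.

    Hypothetical helper: gdf is assumed to be a GeoDataFrame (anything with a
    compatible to_file method works).
    """
    with tempfile.TemporaryDirectory() as tmp:
        shp_path = Path(tmp) / f"{stem}.shp"
        gdf.to_file(filename=str(shp_path), driver="ESRI Shapefile")
        return zip_shapefile_parts(tmp, stem, zip_path)
```

With the GeoDataFrame from the question, the call would look like write_zipped_shapefile(myGDF, 'myshapefile.zip'). The temporary directory is deleted once the archive has been written, so only the zip file remains.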