Subsetting the region of interest in an xarray dataset using a GeoJSON file in salem


If I use a shapefile to subset a region in an xarray dataset, I use the following method:

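import salem

def cookie_cut_shapefile(ds, shapeFile):
    # Read the shapefile into a GeoDataFrame
    shdf = salem.read_shapefile(shapeFile)
    # Crop the dataset to the extent of the shapes
    ds_subset = ds.salem.subset(shape=shdf)
    # Mask out the grid points that fall outside the shapes
    ds_roi = ds_subset.salem.roi(shape=shdf)
    return ds_roi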

But if I have a GeoJSON file, can I use the salem library to do the subsetting? I know I can use regionmask, but I would like to use salem as it is already used in the project.

The workaround I have in mind for using salem with GeoJSON files is to first read the GeoJSON file with geopandas and then write it to a shapefile, which can then be used with salem:

import geopandas

df = geopandas.read_file('sample.geojson')
df.to_file("test.shp")

The issue is that I have to write to disk to convert the GeoJSON to a shapefile, which I do not want to do. How do I handle this case more elegantly? Can salem work with GeoJSON directly somehow?


Solution

  • Salem's subset accepts geopandas GeoDataFrames, so there is no need for a file on disk. See, for instance, the example in the documentation where we use a geopandas GeoDataFrame as input.

    One reason why it might not work is if the GeoJSON file has no coordinate reference system associated with it (shapefiles all do). In this case you'll need to specify it yourself. For example, for a Plate Carrée (lon/lat, WGS84) projection:

    shdf.crs = salem.wgs84
    ds_subset = ds.salem.subset(shape=shdf)
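
Putting the pieces together, here is a minimal sketch of a GeoJSON counterpart to the question's `cookie_cut_shapefile` that never writes to disk. The names `cookie_cut_geodataframe`, `cookie_cut_geojson` and `fallback_crs` are made up for illustration. The sketch also assumes a reasonably recent geopandas, since it uses `GeoDataFrame.set_crs` rather than assigning `shdf.crs` directly as in the snippet above, and it assumes that lon/lat WGS84 (`"EPSG:4326"`) is the right fallback for your data.

```python
def cookie_cut_geodataframe(ds, gdf, fallback_crs=None):
    """Crop `ds` to the extent of the shapes in `gdf`, then mask points outside them.

    `fallback_crs` (a hypothetical parameter) is only used when `gdf`
    carries no coordinate reference system of its own.
    """
    if gdf.crs is None:
        if fallback_crs is None:
            raise ValueError("GeoDataFrame has no CRS; pass fallback_crs")
        # set_crs labels the geometries with a CRS; it does not reproject them
        gdf = gdf.set_crs(fallback_crs)
    ds_subset = ds.salem.subset(shape=gdf)
    return ds_subset.salem.roi(shape=gdf)


def cookie_cut_geojson(ds, geojson_path, fallback_crs="EPSG:4326"):
    """Read a GeoJSON file into memory and use it to cut out `ds`."""
    import geopandas
    import salem  # noqa: F401  (importing salem registers the `.salem` accessor)

    gdf = geopandas.read_file(geojson_path)
    return cookie_cut_geodataframe(ds, gdf, fallback_crs)
```

Usage would then look like `ds_roi = cookie_cut_geojson(ds, 'sample.geojson')`. Keeping the GeoDataFrame step in its own function means the same code also works for shapes that were built or filtered in memory.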