How do you reproject a geopandas GeoSeries in place?


EDIT: Answer: you don't.

Original question:

I just noticed that a geopandas GeoDataFrame allows in-place reprojection:

In [1]: import geopandas as gpd
In [2]: import shapely.geometry as sg
In [3]: data = {'geometry': [sg.Point(9,53), sg.Point(10,53.5)]}

In [4]: gdf = gpd.GeoDataFrame(data, crs='epsg:4326')
In [5]: gdf
Out[5]: 
                    geometry
0   POINT (9.00000 53.00000)
1  POINT (10.00000 53.50000)

In [6]: gdf.to_crs('epsg:3395', inplace=True) #No problem
In [7]: gdf
Out[7]: 
                          geometry
0  POINT (1001875.417 6948849.385)
1  POINT (1113194.908 7041652.839)

...but GeoSeries does not:

In [8]: gs = gpd.GeoSeries(data['geometry'], crs='epsg:4326')
In [9]: gs
Out[9]: 
0     POINT (9.00000 53.00000)
1    POINT (10.00000 53.50000)
dtype: geometry

In [10]: gs.to_crs('epsg:3395', inplace=True) #Problem
TypeError: to_crs() got an unexpected keyword argument 'inplace'

In [11]: gs.to_crs('epsg:3395')
Out[11]: 
0    POINT (1001875.417 6948849.385)
1    POINT (1113194.908 7041652.839)
dtype: geometry

This complicates things a bit in my app, as I was hoping to write a function that takes GeoDataFrames and GeoSeries as *args and reprojects each of them, without needing to return the objects and re-assign them to their variables.

It is not a huge deal. I was mainly wondering why this is the case, as many other methods (like .dropna()) do accept an inplace argument on both GeoDataFrame and GeoSeries objects. So why not this specific method? Is it an oversight? Is there a good reason for it that I'm unaware of? Or am I just using it wrong?

Many thanks!


PS: Although it's beyond the scope of this question, for those wondering about the use case: an in-place version of a method is especially valuable when multiple variables point to the same object. Otherwise there is a danger that some of them keep pointing to an 'old' (i.e., not reprojected) version of the object, leading to errors down the line. Here is a scenario:

gdf = self._geodataframe = gpd.GeoDataFrame(...)  # saving dataframe as instance attribute
gdf.to_crs(..., inplace=True)  # self._geodataframe is also reprojected

gs = self._geoseries = gpd.GeoSeries(...)  # saving series as instance attribute
gs = gs.to_crs(...)  # self._geoseries still has the original crs
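
One way to sidestep the stale reference in the second case is to rebind the attribute itself rather than a local alias. The following is only a minimal sketch under that assumption: the class name `Layer` and its members are hypothetical, and the code relies on nothing beyond the object having a `to_crs(crs)` method that returns a new object, as `GeoSeries.to_crs` does.

```python
class Layer:
    """Hypothetical owner that holds the long-lived reference to the series."""

    def __init__(self, geoseries):
        self._geoseries = geoseries

    @property
    def geoseries(self):
        # Hand out whatever object the attribute currently points to.
        return self._geoseries

    def reproject(self, crs):
        # to_crs returns a new object, so rebind the attribute itself
        # instead of a local variable that merely aliases it.
        self._geoseries = self._geoseries.to_crs(crs)
```

Any variable that was bound to the series before `reproject` was called still points to the old object, so this only helps if other code reads the series through the owner each time.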

Solution

  • GeoDataFrame.to_crs() delegates the transformation to GeoSeries.to_crs(), which in turn reprojects the geometries using apply. apply does not support in-place transformation, and no one has actually tried to implement an in-place option for it manually.

    This is the part of the code responsible for the transformation:

    transformer = Transformer.from_crs(self.crs, crs, always_xy=True)
    result = self.apply(lambda geom: transform(transformer.transform, geom))
    result.__class__ = GeoSeries
    result.crs = crs
    result._invalidate_sindex()
    return result
    

    I believe there is no reason not to support it, but I may well be wrong. Probably no one thought of implementing it :). Feel free to open an issue or make a PR on GitHub.
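
Until such an option exists, the function described in the question can be approximated by returning the reprojected objects and re-assigning them at the call site. This is a sketch rather than an in-place solution: the name `reproject_all` is hypothetical, and it relies only on each argument having a `to_crs(crs)` method that returns a new object, which holds for both GeoDataFrame and GeoSeries when `inplace` is not passed.

```python
def reproject_all(crs, *objects):
    """Return reprojected copies of the given objects, in the same order."""
    # Called without inplace, to_crs returns a new object for both
    # GeoDataFrame and GeoSeries, so one call works for either type.
    return tuple(obj.to_crs(crs) for obj in objects)
```

A call such as `gdf, gs = reproject_all('epsg:3395', gdf, gs)` rebinds both names in one statement, at the cost of the re-assignment the question hoped to avoid. Other references to the original objects are not updated.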