python geospatial geopandas shapely

Why does geopandas' `.to_crs()` give (inf, inf) the first time and the correct result the second time for the same inputs?


When I call geopandas' .to_crs() method twice in a row on the same inputs, I get two different results.

Here's my environment:

name: geotest
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.11
  - geopandas

Here's a minimal example run within that environment:

import geopandas as gpd
import shapely

print(
    gpd.GeoDataFrame(
        geometry=[shapely.geometry.Point(-100,40)], crs='EPSG:4326')
    .to_crs('ESRI:102008')
)
# Exactly the same command
print(
    gpd.GeoDataFrame(
        geometry=[shapely.geometry.Point(-100,40)], crs='EPSG:4326')
    .to_crs('ESRI:102008')
)

From which I get:

$ python minimal.py
Intel MKL WARNING: Support of Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) enabled only processors has been deprecated. Intel oneAPI Math Kernel Library 2025.0 will require Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
Intel MKL WARNING: Support of Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) enabled only processors has been deprecated. Intel oneAPI Math Kernel Library 2025.0 will require Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
          geometry
0  POINT (inf inf)
                       geometry
0  POINT (-321422.376 6782.160)

I would expect the second answer for both commands.
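
To make the discrepancy visible programmatically rather than by reading the printed output, the two calls can be wrapped in a small helper that also reports whether the coordinates are finite. This is only a sketch: `transform_point` is a hypothetical helper name introduced here, and the rest mirrors the minimal example above.

```python
import geopandas as gpd
import numpy as np
from shapely.geometry import Point


def transform_point(lon, lat, dst="ESRI:102008"):
    """Reproject a single lon/lat point from EPSG:4326 to dst via geopandas."""
    gdf = gpd.GeoDataFrame(geometry=[Point(lon, lat)], crs="EPSG:4326")
    return gdf.to_crs(dst).geometry.iloc[0]


# Same inputs, two consecutive calls, as in the minimal example.
first = transform_point(-100, 40)
second = transform_point(-100, 40)

for label, pt in (("first", first), ("second", second)):
    is_finite = bool(np.isfinite([pt.x, pt.y]).all())
    print(label, pt, "finite:", is_finite)
```

In an affected environment the first line should report a non-finite point and the second a finite one; in an unaffected environment both should be finite.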

The relevant parts of my conda list are:

python                    3.11.7               hf27a42d_0  
gdal                      3.6.2           py311he4f215e_4  
geopandas                 0.14.2          py311hecd8cb5_0  
geopandas-base            0.14.2          py311hecd8cb5_0  
geos                      3.8.0                hb1e8313_0  
pyproj                    3.6.1           py311h717f92e_0  
shapely                   2.0.1           py311ha6175ea_0  

I'm on an M1 Mac, but I'm using the Intel version of Anaconda for compatibility with other packages.


Solution

  • Thanks to @Diego, who opened an issue with geopandas (https://github.com/geopandas/geopandas/issues/3433), I found two ways to solve this problem, which originates in pyproj:

    Option 1: Disable network use

    As described at https://github.com/pyproj4/pyproj/issues/705, add these lines to the preamble:

    import pyproj
    pyproj.network.set_network_enabled(False)
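
To check that Option 1 takes effect without going through geopandas, the same transformation can be run with pyproj directly after disabling the network. This is a minimal sketch, assuming the setting is applied before any transformer is created; `always_xy=True` is used here so the input order is (longitude, latitude), matching `Point(-100, 40)`.

```python
import math

import pyproj
import pyproj.network

# Turn off PROJ's network access before any transformer is created.
pyproj.network.set_network_enabled(active=False)
print("network enabled:", pyproj.network.is_network_enabled())

# always_xy=True keeps (longitude, latitude) order for the input point.
transformer = pyproj.Transformer.from_crs(
    "EPSG:4326", "ESRI:102008", always_xy=True
)
x, y = transformer.transform(-100, 40)

print(x, y)
print("finite:", math.isfinite(x) and math.isfinite(y))
```

If the fix works as described, the very first transform should already print finite coordinates.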
    

    Option 2: Pre-download projection data

    As described at https://pyproj4.github.io/pyproj/stable/transformation_grids.html, add the proj-data package to the conda environment. It's also safest to use a single channel rather than mixing defaults and conda-forge. Either run `conda install -c conda-forge proj-data` or use this environment:

    name: geotest
    channels:
      - conda-forge
    dependencies:
      - python
      - geopandas
      - proj-data
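
After installing proj-data, one way to sanity-check the environment is to look at the directory pyproj reads its data from and count the grid files there. This is a rough sketch, assuming the grids are shipped as GeoTIFF (`.tif`) files in that directory; the exact count depends on the installed package version.

```python
import os
from pathlib import Path

import pyproj.datadir

# Directory (or directories) where pyproj looks for proj.db and grid files.
data_dir = pyproj.datadir.get_data_dir()

# Split on os.pathsep in case more than one directory is reported.
grid_files = [
    path
    for directory in data_dir.split(os.pathsep)
    for path in Path(directory).glob("*.tif")
]

print("PROJ data directory:", data_dir)
print("GeoTIFF grid files found:", len(grid_files))
```

A count of zero would suggest the grids are not where pyproj expects them, in which case the environment may still be falling back to network access.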