python pytest shapely

Why does invoking a Python program from another Python program make the original program slower?


In my tool hydrotopo, running find_edge_nodes to find topological relations for about 100,000 linestrings takes only about 10 seconds, but in the code below the same call takes about 30 minutes:

# extract from https://github.com/iHeadWater/torchhydro/blob/dev-gnn/torchhydro/datasets/data_sets.py#L872
import hydrotopo.ig_path as htip

# the same call takes about 30 minutes here
graph_lists = htip.find_edge_nodes(node_features, network_features, node_idx, 'up', cutoff)
return graph_lists

In short, when I call my tool from another program, it becomes very slow, and when I pause the program in debug mode, it stops in predicates.has_z. There are not many loops in my code.

So why does has_z take so much time, or should I look for another solution?

If you want to know more details, please see this: https://github.com/shapely/shapely/issues/2108


Update: hydrotopo and torchhydro both use Python 3.11, and both are tested in IntelliJ IDEA with pytest.

I'm sure hydrotopo is compiled (/home/username/.conda/envs/torchhydro1/lib/python3.11/site-packages/hydrotopo/__pycache__/ig_path.cpython-311.pyc)


I tried to debug test_run_model with pytest, but it failed with this:

(torchhydro1) username@vm-jupyterhub-server:~/torchhydro$ pytest experiments/train_with_era5land_gnn.py::run_test_model

platform linux -- Python 3.11.9, pytest-8.3.2, pluggy-1.5.0
rootdir: /home/username/torchhydro
configfile: setup.cfg
collected 0 items / 1 error

ImportError while importing test module '/home/username/torchhydro/experiments/train_with_era5land_gnn.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../.conda/envs/torchhydro1/lib/python3.11/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
experiments/train_with_era5land_gnn.py:14: in <module>
    from torchhydro.configs.config import cmd, default_config_file, update_cfg
E   ModuleNotFoundError: No module named 'torchhydro'

ERROR: found no collectors for /home/username/torchhydro/experiments/train_with_era5land_gnn.py::run_test_model

Solution

  • I found out what made my program slower. hydrotopo is tested with numpy==1.26 and shapely==2.0.1, and with those versions performance is normal; with numpy==2.0 and shapely==2.0.5, performance became much worse. I have downgraded the two packages and reported this to the shapely developers.

    It has nothing to do with IDEA or PyCharm.
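Since the difference came down to package versions, it helps to print the installed versions in each environment before comparing timings. Here is a small sketch using only the standard library; installed_versions is a hypothetical helper name, not part of hydrotopo or shapely.

```python
from importlib import metadata


def installed_versions(packages=("numpy", "shapely")):
    """Return {package_name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this once in the environment where hydrotopo is fast and once in the slow one makes a version mismatch visible right away.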