Tags: python, numpy, anaconda, conda, python-poetry

How does Poetry work regarding binary dependencies? (esp. numpy)


Until now I have been using conda for virtual environments and dependency management. However, some things do not work as expected when transferring my environment.yml file from my development machine to the production server. Now I would like to look into alternatives. Poetry seems nice, especially because

poetry also maintains a lock file, and it has a benefit over pipenv because it keeps track of which packages are subdependencies. (https://realpython.com/effective-python-environment/#poetry)

which might improve stability quite a bit. However, I am working on science-heavy projects (matrices, data science, machine learning), so in practice I need the SciPy stack (e.g. numpy, pandas, scikit-learn).

Python became too slow for some pure computational workloads so numpy and scipy were born. [...] They are written in C and just wrapped as a python library.

Compiling such libraries brings a set of challenges since they (more or less) have to be compiled on your machine for maximum performance and proper linking with libraries like glibc.

Conda was introduced as an all-in-one solution to manage python environments for the scientific community.

[...] Instead of using a fragile process of compiling libraries on your machine, libraries are precompiled and just downloaded when you request them. Unfortunately, the solution comes with a caveat - conda does not use PyPI, the most popular index of python packages.

(https://modelpredict.com/python-dependency-management-tools#fnref:conda-compiling-challenges)

As far as I know, this doesn't even do conda justice, because conda does quite a bit of optimization to get the most out of my CPU/GPU/architecture for numpy. (https://jakevdp.github.io/blog/2016/08/25/conda-myths-and-misconceptions/#Myth-#6:-Now-that-pip-uses-wheels,-conda-is-no-longer-necessary)

https://numpy.org/install/ itself recommends conda, but also says that one can install via pip (and Poetry uses PyPI):

For users who know, from personal preference or reading about the main differences between conda and pip below, they prefer a pip/PyPI-based solution, we recommend:

[...] Use Poetry as the most well-maintained tool that provides a dependency resolver and environment management capabilities in a similar fashion as conda does.

I would like to get the stability of the poetry setup and the speed of the conda setup.

How does poetry handle binary dependencies? Does it also, like conda, consider my hardware?

If Poetry does not deliver in this regard, can I combine it with conda?


Solution

  • numpy provides several wheel files for different operating systems, CPU architectures and Python versions. Wheel packages are precompiled, so the target system doesn't have to compile the package.

    Poetry is able to choose the right wheel for you, depending on your system.

    That said, I would recommend using Poetry as long as you only need Python packages that are available on PyPI. As soon as you need other, non-Python tools, stick with conda. (Disclaimer: I'm one of the maintainers of Poetry.)

    Also related: https://github.com/python-poetry/poetry/issues/1904
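
To make "choose the right wheel" more concrete, here is a minimal sketch of how a wheel can be matched to a system by the tags in its filename. This is not Poetry's actual code. The helpers `parse_wheel_tags` and `pick_wheel`, the file list and the list of supported tags are all hypothetical and only for illustration. On a real system the `packaging` library can report the supported tags via `packaging.tags.sys_tags()`.

```python
from itertools import product


def parse_wheel_tags(filename):
    """Return the set of (python, abi, platform) tags encoded in a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # The last three dash-separated fields are the python, ABI and platform tags.
    py, abi, plat = parts[-3:]
    # Each field may hold several dot-separated alternatives ("compressed" tags).
    return set(product(py.split("."), abi.split("."), plat.split(".")))


def pick_wheel(filenames, supported_tags):
    """Return the wheel matching the most preferred supported tag, or None."""
    for tag in supported_tags:  # assumed to be ordered, most preferred first
        for name in filenames:
            if tag in parse_wheel_tags(name):
                return name
    return None


# Illustrative file list, as an index might offer for one numpy release.
wheels = [
    "numpy-1.26.4-cp312-cp312-win_amd64.whl",
    "numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "numpy-1.26.4-cp312-cp312-macosx_11_0_arm64.whl",
]

# Assumed tags for CPython 3.12 on x86-64 Linux (hypothetical, shortened list).
linux_tags = [
    ("cp312", "cp312", "manylinux_2_17_x86_64"),
    ("cp312", "cp312", "manylinux2014_x86_64"),
    ("py3", "none", "any"),
]

chosen = pick_wheel(wheels, linux_tags)
print(chosen)
```

Note that the tags only describe the interpreter, its ABI and the platform (operating system plus CPU architecture), which is the level of system awareness described above. If no wheel matches, `pick_wheel` returns None; an installer would then typically have to fall back to a source distribution.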