I am currently developing a Python package that uses cython and numpy, and I want the package to be installable using the pip install command from a clean Python installation. All dependencies should be installed automatically. I am using setuptools with the following setup.py:
import setuptools

my_c_lib_ext = setuptools.Extension(
    name="my_c_lib",
    sources=["my_c_lib/some_file.pyx"]
)

setuptools.setup(
    name="my_lib",
    version="0.0.1",
    author="Me",
    author_email="[email protected]",
    description="Some python library",
    packages=["my_lib"],
    ext_modules=[my_c_lib_ext],
    setup_requires=["cython >= 0.29"],
    install_requires=["numpy >= 1.15"],
    classifiers=[
        "Programming Language :: Python :: 3",
        "Operating System :: OS Independent"
    ]
)
This has worked great so far. The pip install command downloads cython for the build and is able to build my package and install it together with numpy.
Now I want to improve the performance of my cython code, which leads to some changes in my setup.py. I need to add include_dirs=[numpy.get_include()] to either the call of setuptools.Extension(...) or setuptools.setup(...), which means that I also need to import numpy. (See http://docs.cython.org/en/latest/src/tutorial/numpy.html and "Make distutils look for numpy header files in the correct place" for the rationale.)
This is bad. Now the user cannot call pip install from a clean environment, because import numpy will fail. The user needs to pip install numpy before installing my library. Even if I move "numpy >= 1.15" from install_requires to setup_requires, the installation fails, because the import numpy is evaluated earlier.
Is there a way to evaluate the include_dirs at a later point of the installation, for example, after the dependencies from setup_requires or install_requires have been resolved? I would really like to have all dependencies resolved automatically, and I don't want the user to type multiple pip install commands.
The following snippet works, but it is not officially supported because it uses an undocumented (and private) method:
class NumpyExtension(setuptools.Extension):
    # setuptools calls this function after installing dependencies
    def _convert_pyx_sources_to_lang(self):
        import numpy
        self.include_dirs.append(numpy.get_include())
        super()._convert_pyx_sources_to_lang()

my_c_lib_ext = NumpyExtension(
    name="my_c_lib",
    sources=["my_c_lib/some_file.pyx"]
)
The article "How to Bootstrap numpy installation in setup.py" proposes using a cmdclass with a custom build_ext class. Unfortunately, this breaks the build of the cython extension because cython also customizes build_ext.
First question: when is numpy needed? It is needed during the setup (i.e. when the build_ext functionality is called) and in the installation, when the module is used. That means numpy should be in setup_requires and in install_requires.
There are the following alternatives for solving the issue for the setup:
- declaring the build requirements in a pyproject.toml file
- using the setup_requires argument of setup and postponing the import of numpy until setup's requirements are satisfied (which is not the case at the start of setup.py's execution)

For the first alternative, put a pyproject.toml file next to setup.py, with the following content:
[build-system]
requires = ["setuptools", "wheel", "Cython>=0.29", "numpy >= 1.15"]
which defines the packages needed for building, and then install using pip install . in the folder with setup.py. A disadvantage of this method is that python setup.py install no longer works, as it is pip that reads pyproject.toml. However, I would use this approach whenever possible.
The setup_requires approach is more complicated and somewhat hacky, but it also works without pip.
First, let's take a look at the unsuccessful attempts so far:

pybind11-trick

@chrisb's "pybind11"-trick works as follows: with the help of an indirection, one delays the call to import numpy until numpy is present during the setup phase, i.e.:
class get_numpy_include(object):
    def __str__(self):
        import numpy
        return numpy.get_include()
...
my_c_lib_ext = setuptools.Extension(
    ...
    include_dirs=[get_numpy_include()]
)
Clever! The problem: it doesn't work with the Cython compiler. Somewhere down the line, Cython passes the get_numpy_include object to os.path.join(...,...), which checks whether the argument is really a string, which it obviously isn't.
This could be fixed by inheriting from str, but the above shows the dangers of the approach in the long run: it doesn't use the designed mechanics, is brittle, and may easily fail in the future.
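The failure mode described above can be reproduced without Cython or numpy. The following is only a minimal sketch: LazyInclude and the path it returns are hypothetical stand-ins for get_numpy_include and numpy.get_include(), and the call to os.path.join merely imitates what happens somewhere inside the Cython build.

```python
import os


class LazyInclude(object):
    """Hypothetical stand-in for get_numpy_include: resolves the path lazily."""

    def __str__(self):
        # In a real setup.py this would be:
        #     import numpy; return numpy.get_include()
        return "/hypothetical/numpy/include"


lazy = LazyInclude()

# Code that converts the entry with str() sees the resolved path ...
as_text = str(lazy)

# ... but os.path.join only accepts str, bytes or os.PathLike objects,
# so passing the lazy object directly raises a TypeError.
try:
    os.path.join("build", lazy)
    join_failed = False
except TypeError:
    join_failed = True
```

So the trick only works as long as every consumer of include_dirs happens to call str() on the entries first.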
the classical build_ext-solution

It looks as follows:
...
from setuptools.command.build_ext import build_ext as _build_ext

class build_ext(_build_ext):
    def finalize_options(self):
        _build_ext.finalize_options(self)
        # Prevent numpy from thinking it is still in its setup process:
        __builtins__.__NUMPY_SETUP__ = False
        import numpy
        self.include_dirs.append(numpy.get_include())

setuptools.setup(
    ...
    cmdclass={'build_ext': build_ext},
    ...
)
Yet this solution doesn't work with cython extensions either, because pyx-files don't get recognized.
The real question is, how did pyx-files get recognized in the first place? The answer is this part of setuptools.command.build_ext:
...
try:
    # Attempt to use Cython for building extensions, if available
    from Cython.Distutils.build_ext import build_ext as _build_ext
    # Additionally, assert that the compiler module will load
    # also. Ref #1229.
    __import__('Cython.Compiler.Main')
except ImportError:
    _build_ext = _du_build_ext
...
That means setuptools tries to use Cython's build_ext if possible, and because the import of the module is delayed until build_ext is called, it finds Cython present.
The situation is different when setuptools.command.build_ext is imported at the beginning of setup.py: Cython isn't present yet, and a fallback without cython functionality is used.
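The effect of import timing on such a try/except ImportError fallback can be illustrated in isolation. This is a rough sketch, not setuptools code: pick_backend and the module name hypothetical_cython_standin are made up for illustration, and registering an entry in sys.modules merely simulates the moment at which setup_requires has made the dependency available.

```python
import importlib
import sys
import types


def pick_backend(module_name):
    """Imitate the fallback: the choice depends on what is importable right now."""
    try:
        importlib.import_module(module_name)
        return "cython-aware"
    except ImportError:
        return "plain fallback"


NAME = "hypothetical_cython_standin"  # made-up name, not a real package

# Evaluated early (top of setup.py): the dependency is not installed yet.
early = pick_backend(NAME)

# Simulate the build requirements being installed in the meantime.
sys.modules[NAME] = types.ModuleType(NAME)

# Evaluated late (when the command actually runs): the dependency is found.
late = pick_backend(NAME)
```

The same check gives two different answers, which is why the point at which setuptools.command.build_ext gets imported matters.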
mixing up the pybind11-trick and the classical solution

So let's add an indirection, so we don't have to import setuptools.command.build_ext directly at the beginning of setup.py:
...
# factory function
def my_build_ext(pars):
    # import delayed:
    from setuptools.command.build_ext import build_ext as _build_ext

    # include_dirs adjusted:
    class build_ext(_build_ext):
        def finalize_options(self):
            _build_ext.finalize_options(self)
            # Prevent numpy from thinking it is still in its setup process:
            __builtins__.__NUMPY_SETUP__ = False
            import numpy
            self.include_dirs.append(numpy.get_include())

    # object returned:
    return build_ext(pars)
...
setuptools.setup(
    ...
    cmdclass={'build_ext': my_build_ext},
    ...
)
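Why a plain function can stand in for a command class can be sketched without setuptools. The sketch assumes that the command object is created by calling the cmdclass entry with a single argument (as the pars parameter above suggests); the dictionary lookup and the string "fake distribution" are hypothetical stand-ins for what setuptools does internally, and whether a given setuptools version accepts a non-class entry is not checked here.

```python
events = []


def my_build_ext(pars):
    # Imports that need the build dependencies would go here, so they
    # run only when the command object is actually requested.
    events.append("factory called")

    class BuildExt:
        """Hypothetical stand-in for the real build_ext subclass."""

        def __init__(self, dist):
            self.distribution = dist

    return BuildExt(pars)


# Registering the factory does not execute its body ...
cmdclass = {"build_ext": my_build_ext}
events_after_registration = list(events)

# ... it only runs when the entry is called like a class.
command = cmdclass["build_ext"]("fake distribution")
```

The caller cannot tell the difference as long as it only calls the entry and uses the returned object.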