Build multiple wheels at runtime


I want to build wheels at runtime that contain some scripts as well as some payload data. In this example, the target folder contains two simple builds, build_123 and build_124, each to be packaged as a wheel.

main_project/
├── __init__.py
└── whl_util.py               # wheel building script posted below
target/
├── build_123/                # contains one build to be packaged as a whl
│   └── mypkg
│       ├── __init__.py
│       ├── data
│       │   ├── __init__.py
│       │   └── mat.json
│       └── main
│           ├── __init__.py
│           └── dumpmat.py
└── build_124/                # contains another build to be packaged as a whl
    └── mypkg
        ├── ...

In my scenario these wheels are used as an output format, and packaging is not the main purpose of the process. The wheel packaging should be treated as a simple IO operation that reads a build folder and outputs a wheel, with no other side effects. To perform this task I came up with this solution:

# main_project/whl_util.py
from setuptools import setup, find_packages
import sys
import shutil
import os

def bdist_wheel(build_dir=".", dist_dir=None):
    # backing up argv to restore them afterwards
    argv_bak = sys.argv[:]

    # clear args from running script with "bdist_wheel"
    file = sys.argv[0]
    sys.argv.clear()
    sys.argv.extend([file, "bdist_wheel"])

    if dist_dir is not None and "--dist-dir" not in sys.argv:
        sys.argv.extend(["--dist-dir", dist_dir])

    sys.argv.extend(["clean", "--all"])

    setup(
        name="mypkg",
        version="0.1",  # version is expected to be a string, not a float
        packages=find_packages(build_dir),
        install_requires=[],
        include_package_data=True,
        package_dir={'': build_dir},
        package_data={"mypkg.data": ["mat.json"]}
    )

    # restore args
    sys.argv.clear()
    sys.argv.extend(argv_bak)


def main():
    # Adding main method here for testing.
    # As mentioned in my actual scenario the wheels should be built as an output format at runtime
    print("BUILD 123")
    bdist_wheel("target/build_123", dist_dir="target/dist_123")
    print("BUILD 124")
    bdist_wheel("target/build_124", dist_dir="target/dist_124")

if __name__ == "__main__":
    main()

I also don't really like passing parameters to setuptools via sys.argv, but it seems to be the only way. The main issue, however, is that the first wheel is built normally while the second call to bdist_wheel/setup raises an error:

python3 -m main_project.whl_util
BUILD 123
running bdist_wheel
running build
running build_py
creating build
creating build/lib
creating build/lib/mypkg
copying target/build_123/mypkg/__init__.py -> build/lib/mypkg
creating build/lib/mypkg/data
copying target/build_123/mypkg/data/__init__.py -> build/lib/mypkg/data
creating build/lib/mypkg/main
copying target/build_123/mypkg/main/__init__.py -> build/lib/mypkg/main
copying target/build_123/mypkg/main/dumpmat.py -> build/lib/mypkg/main
running egg_info
writing target/build_123/mypkg.egg-info/PKG-INFO
writing dependency_links to target/build_123/mypkg.egg-info/dependency_links.txt
writing top-level names to target/build_123/mypkg.egg-info/top_level.txt
writing manifest file 'target/build_123/mypkg.egg-info/SOURCES.txt'
copying target/build_123/mypkg/data/mat.json -> build/lib/mypkg/data
installing to build/bdist.linux-x86_64/wheel
running install
running install_lib
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/wheel
creating build/bdist.linux-x86_64/wheel/mypkg
copying build/lib/mypkg/__init__.py -> build/bdist.linux-x86_64/wheel/mypkg
creating build/bdist.linux-x86_64/wheel/mypkg/data
copying build/lib/mypkg/data/__init__.py -> build/bdist.linux-x86_64/wheel/mypkg/data
copying build/lib/mypkg/data/mat.json -> build/bdist.linux-x86_64/wheel/mypkg/data
creating build/bdist.linux-x86_64/wheel/mypkg/main
copying build/lib/mypkg/main/__init__.py -> build/bdist.linux-x86_64/wheel/mypkg/main
copying build/lib/mypkg/main/dumpmat.py -> build/bdist.linux-x86_64/wheel/mypkg/main
running install_egg_info
Copying target/build_123/mypkg.egg-info to build/bdist.linux-x86_64/wheel/mypkg-0.1-py3.7.egg-info
running install_scripts
creating build/bdist.linux-x86_64/wheel/mypkg-0.1.dist-info/WHEEL
creating 'target/dist_123/mypkg-0.1-py3-none-any.whl' and adding 'build/bdist.linux-x86_64/wheel' to it
adding 'mypkg/__init__.py'
adding 'mypkg/data/__init__.py'
adding 'mypkg/data/mat.json'
adding 'mypkg/main/__init__.py'
adding 'mypkg/main/dumpmat.py'
adding 'mypkg-0.1.dist-info/METADATA'
adding 'mypkg-0.1.dist-info/WHEEL'
adding 'mypkg-0.1.dist-info/top_level.txt'
adding 'mypkg-0.1.dist-info/RECORD'
removing build/bdist.linux-x86_64/wheel
BUILD 124
running bdist_wheel
running build
running build_py
copying target/build_124/mypkg/__init__.py -> build/lib/mypkg
error: could not create 'build/lib/mypkg/__init__.py': No such file or directory

Process finished with exit code 1

The "No such file or directory" error suggests that setuptools keeps track of the folders it has already created and assumes that these folders still exist. However, after the first wheel is built, the clean command deletes the build folder (which is necessary, since setuptools would otherwise reuse the folder without clearing it).

My only working solution so far is to fork the process before calling setup:

pid = os.fork()
if pid == 0:
    setup(...)
    sys.exit(0)
os.waitpid(pid, 0)

But since this seems very dirty, and my main process is also very memory intensive, I'd rather avoid this method.

So my main question is: Is there a way to build a wheel without any side effects? Or is there a way to reset the state of the setuptools module after calling setup? Ideally, I would like to create the wheel in an in-memory PyFilesystem and only write the finished wheel to disk.


Solution

  • I am not sure setuptools is meant to be used this way. As far as I know pip and co. (wheel, setuptools, etc.) don't really have public APIs or at least no friendly ones.

    The distlib library looks like a promising alternative with an actual API. See distlib's documentation on "Using the wheel API".
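For a pure-Python payload like the one in the question, a wheel is essentially a zip archive with a .dist-info directory (PEP 427), so it can also be assembled in memory with the standard library alone. The following is a minimal sketch of that swapped-in approach, not distlib's or setuptools' API. The function name build_wheel_bytes, the fixed py3-none-any tag, and the bare-bones METADATA are assumptions made for illustration, and the name is assumed to need no normalization.

```python
import base64
import csv
import hashlib
import io
import zipfile
from pathlib import Path


def build_wheel_bytes(build_dir, name, version):
    """Return the bytes of a pure-Python wheel assembled from build_dir.

    Hypothetical helper: nothing is written to disk and build_dir is only read.
    """
    root = Path(build_dir)
    dist_info = f"{name}-{version}.dist-info"

    # Collect the payload, skipping caches and leftovers of earlier builds.
    files = {}
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        skip = any(p == "__pycache__" or p.endswith(".egg-info") for p in rel.parts)
        if path.is_file() and not skip:
            files[rel.as_posix()] = path.read_bytes()

    # Minimal metadata files required by the wheel format.
    files[f"{dist_info}/METADATA"] = (
        f"Metadata-Version: 2.1\nName: {name}\nVersion: {version}\n"
    ).encode("utf-8")
    files[f"{dist_info}/WHEEL"] = (
        "Wheel-Version: 1.0\n"
        "Generator: whl_util_sketch\n"
        "Root-Is-Purelib: true\n"
        "Tag: py3-none-any\n"
    ).encode("utf-8")

    # RECORD lists every file with its sha256 digest and size; its own entry is empty.
    record = io.StringIO()
    writer = csv.writer(record, lineterminator="\n")
    for arcname, data in files.items():
        digest = base64.urlsafe_b64encode(hashlib.sha256(data).digest()).rstrip(b"=")
        writer.writerow([arcname, "sha256=" + digest.decode("ascii"), len(data)])
    writer.writerow([f"{dist_info}/RECORD", "", ""])
    files[f"{dist_info}/RECORD"] = record.getvalue().encode("utf-8")

    # Write everything into an in-memory zip archive.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for arcname, data in files.items():
            archive.writestr(arcname, data)
    return buffer.getvalue()
```

The returned bytes could then be written in a single step to a file such as target/dist_123/mypkg-0.1-py3-none-any.whl, so the build folder itself is never modified.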

    If distlib doesn't work out, I would probably give one of these a try:

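    subprocess.check_call([sys.executable, '-m', 'wheel', 'pack', 'target/build_123'])

    See the wheel pack documentation. Note that wheel pack repacks an unpacked wheel, so the directory must already contain a mypkg-0.1.dist-info folder with METADATA and WHEEL files.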

    subprocess.check_call([sys.executable, '-m', 'pip', 'wheel', 'target/build_123'])

    See the pip wheel documentation. Note that pip wheel expects a project directory containing a setup.py or pyproject.toml. The reasons why pip can't be used through API calls are noted in the "Using pip from your program" section of pip's documentation.

    There is a somewhat similar question with some interesting ideas:

    Maybe generating a setup.py dynamically at run time could help.
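The idea of generating a setup.py at run time could be combined with the pip wheel call above, roughly as sketched below. This is only one possible arrangement under stated assumptions: the helper names render_setup_py and build_wheel_in_subprocess are hypothetical, the build folder is copied into a temporary directory so that setuptools leaves no build or egg-info folders behind, and setuptools and wheel are assumed to be installed so that --no-build-isolation can be used.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path

# Template for the generated setup.py; mirrors the setup() call from the question.
SETUP_TEMPLATE = """\
from setuptools import setup, find_packages

setup(
    name={name!r},
    version={version!r},
    packages=find_packages({src!r}),
    package_dir={{"": {src!r}}},
    include_package_data=True,
    package_data={{"mypkg.data": ["mat.json"]}},
)
"""


def render_setup_py(name, version, src="src"):
    """Return the source code of a setup.py for the given project (hypothetical helper)."""
    return SETUP_TEMPLATE.format(name=name, version=version, src=src)


def build_wheel_in_subprocess(build_dir, dist_dir, name="mypkg", version="0.1"):
    """Build a wheel from build_dir in a child interpreter and store it in dist_dir."""
    dist_dir = Path(dist_dir).resolve()
    dist_dir.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as work_dir:
        work_dir = Path(work_dir)
        # Work on a copy so that the original build folder stays untouched.
        shutil.copytree(build_dir, work_dir / "src")
        (work_dir / "setup.py").write_text(render_setup_py(name, version))
        # A fresh interpreter means no setuptools state is shared between builds.
        subprocess.check_call([
            sys.executable, "-m", "pip", "wheel",
            "--no-deps", "--no-build-isolation",
            "--wheel-dir", str(dist_dir),
            str(work_dir),
        ])
```

Calling build_wheel_in_subprocess("target/build_123", "target/dist_123") and then build_wheel_in_subprocess("target/build_124", "target/dist_124") would run each build in its own process, which sidesteps the stale state seen in the question at the cost of copying the build folder.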