Tags: python, setuptools, python-packaging, pyproject.toml

Packaging with pyproject.toml that will include other multi level directories


I have spent two full days trying to figure this out with no success. I have a python project that I want to package. That is the easy part. The part I can't figure out is how to copy other multilevel folders into the package.

I am trying to get myproj installed as a package containing dosomething.py, and also to have the algorithms and configs folders included in their entirety (folder structure and all files).

Project structure:

 - myproj
     - src
         - myproj
             - dosomething.py
     - tests
         - test_do_something.py
     - algorithms
         - Type1
             - algorithm1
                 - eval.py
                 - hyperparameters.yml
                 - a bunch of other folders and files
             - algorithm2
                 - eval.py
                 - hyperparameters.yml
                 - a bunch of other folders and files
         - Type2
             - algorithm3
                 - eval.py
                 - hyperparameters.yml
                 - a bunch of other folders and files
             - algorithm4
                 - eval.py
                 - hyperparameters.yml
                 - a bunch of other folders and files
     - configs
         - type1.yml
         - type2.yml
     - pyproject.toml
     - mypy.ini
     - tox.ini
     - run-mypy

Contents of pyproject.toml:

[build-system]
requires = ["setuptools >= 61.0", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "myproj"
version = "0.0.1"
readme = "README.md"
dependencies = [
    "numpy >= 1.26.4",
    ]

[project.urls]
repository = "https://gitlab.com/somerepo"

[project.optional-dependencies]
dev = [
    "tox",
    "pytest",
    "pytest-sugar",
    "pytest-cov",
    "black",
    "flake8",
    "flake8-docstrings",
    "mypy",
    "pre-commit",
    "sphinx",
    "sphinx-rtd-theme"
]

After doing a pip install in a new conda env, I am looking for the following:

- env/test_env/lib/python3.9/site-packages
    - myproj
        - algorithms and everything under it
        - configs and everything under it
        - dosomething.py

I have tried MANIFEST.in, and I have tried everything on this page: https://setuptools.pypa.io/en/latest/userguide/datafiles.html

I figured it would be trivial to copy folders and files in their entirety into a package folder, but I am stumped.

Any help would be greatly appreciated!


Solution

  • There are 2 standard directory structures automatically supported by setuptools without extra configuration:

    1. The vanilla src-layout: in summary, whatever you put inside the src directory will be copied to site-packages during the installation.
    2. The vanilla flat-layout: in summary, there is a single folder directly inside your project root (named after the "import-name" you want your users to use) that will itself be copied to site-packages during the installation.

    Note that in both cases a part of the directory structure in the source-code is mimicked inside site-packages.
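One way to see which part of the source tree was actually mirrored into site-packages is to list the files of the installed package. The following is a minimal sketch, assuming Python 3.9+ and a regular package (one with an __init__.py); the helper name installed_tree is hypothetical, and the standard-library package "json" is only a stand-in for "myproj". Folders without an __init__.py may not be supported by importlib.resources on every Python version.

```python
from importlib.resources import files


def installed_tree(package: str) -> list[str]:
    """Return the sorted relative paths of all files inside an installed package."""

    def walk(node, prefix=""):
        for child in node.iterdir():
            if child.name == "__pycache__":
                continue  # skip bytecode caches, they are not part of the source
            path = f"{prefix}{child.name}"
            if child.is_dir():
                yield from walk(child, f"{path}/")
            else:
                yield path

    return sorted(walk(files(package)))


if __name__ == "__main__":
    # "json" is a stand-in; after `pip install .`, try "myproj" instead.
    for path in installed_tree("json"):
        print(path)
```

If algorithms or configs do not show up in the listing for myproj, the files were not packaged, whatever the source tree looks like.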

    In your example, it seems that you want to cherry-pick different parts of your source code and reassemble them in a different directory structure inside site-packages (with no identical counterpart in the source code). This is a non-trivial customisation that requires extra configuration:

    1. You need to explicitly set tool.setuptools.packages, since the auto-discovery cannot handle custom project layouts. Probably something like (untested, may require adjustments):

      [tool.setuptools]
      packages = ["myproj", "myproj.algorithms", "myproj.configs", ...]
      

      You need to list every nested package explicitly. Also note that in Python every folder is considered a package (a namespace package) and can be imported, even if it doesn't contain an __init__.py.

    2. You need to set tool.setuptools.package-dir to "remap"/"reassemble" your files into the custom structure. Probably something like (untested, may require adjustments):

      [tool.setuptools.package-dir]
      # Dotted names must be quoted; a bare myproj.algorithms is a nested TOML key.
      "myproj" = "src/myproj"
      "myproj.algorithms" = "algorithms"
      "myproj.configs" = "configs"
      
    3. (Optional) If you want to avoid having to write MANIFEST.in, you can also add tool.setuptools.package-data (untested, may require adjustments):

      [tool.setuptools.package-data]
      "" = ["*.yml", ...]
      

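      Here "" (empty string) is a special value that means "any package". Note that the setuptools documentation spells this wildcard as "*" in pyproject.toml, so that form may be the one to use here.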
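To make the "remapping" in step 2 more concrete, here is a simplified model of how a dotted package name can be resolved to a source folder: the longest prefix found in package-dir wins, and the remaining name parts become sub-folders. This is only a sketch of the lookup as I understand it, not setuptools' own code; the function name source_dir_for is hypothetical.

```python
from pathlib import PurePosixPath


def source_dir_for(package: str, package_dir: dict[str, str]) -> str:
    """Map a dotted package name to a source folder via its longest mapped prefix."""
    parts = package.split(".")
    tail: list[str] = []
    while parts:
        prefix = ".".join(parts)
        if prefix in package_dir:
            return str(PurePosixPath(package_dir[prefix], *tail))
        # Not mapped: move the last name part into the tail and retry the parent.
        tail.insert(0, parts.pop())
    # No mapped prefix at all: fall back to the root mapping (""), if any.
    return str(PurePosixPath(package_dir.get("", ""), *tail))


if __name__ == "__main__":
    mapping = {
        "myproj": "src/myproj",
        "myproj.algorithms": "algorithms",
        "myproj.configs": "configs",
    }
    print(source_dir_for("myproj", mapping))  # -> src/myproj
    print(source_dir_for("myproj.algorithms.Type1.algorithm1", mapping))
    # -> algorithms/Type1/algorithm1
```

Under this model only the top of each remapped tree needs an entry in package-dir, while packages still has to name every nested folder.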

    This can be quite tedious (and it also requires changes in pyproject.toml every time you create a new folder). That is why most people tend to go with "convention over configuration" and simply restructure their project folders to match either the vanilla src-layout or the flat-layout.
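If the custom layout is kept anyway, the package list can at least be generated instead of typed by hand. The sketch below walks a folder and prints one dotted name per sub-directory, ready to paste into packages. It assumes it is run from the project root and that every folder name is a valid Python identifier (a folder with spaces or dashes in its name would produce a name setuptools may reject); the helper name nested_packages is hypothetical.

```python
from pathlib import Path


def nested_packages(source_dir: Path, import_prefix: str) -> list[str]:
    """Return import_prefix plus one dotted name per sub-directory of source_dir."""
    names = [import_prefix]
    for path in sorted(source_dir.rglob("*")):
        relative = path.relative_to(source_dir)
        if path.is_dir() and "__pycache__" not in relative.parts:
            names.append(".".join((import_prefix, *relative.parts)))
    return names


if __name__ == "__main__":
    # Run from the project root; prints a list to paste into `packages`.
    found = ["myproj"]
    for folder in ("algorithms", "configs"):
        if Path(folder).is_dir():
            found += nested_packages(Path(folder), f"myproj.{folder}")
    print(found)
```

The script still has to be rerun whenever a folder is added, which is the maintenance cost the conventional layouts avoid.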