Tags: python, metadata, python-packaging

How to properly capture library / package version in my package when using pyproject.toml to build


I have moved away from using a setup.py file to build my packages / libraries and now rely entirely on pyproject.toml. I prefer it overall, but it seems that the version placed in pyproject.toml does not propagate through the build in any way. So I cannot figure out how to inject the package version, or any other metadata provided in pyproject.toml, into my package.

A Google search led me to this thread, which had some suggestions. They all seemed like hacks, but I tried this one because it seemed the best:

from pip._vendor import tomli ## I need to be backwards compatible to Python 3.8

with open("pyproject.toml", "rb") as proj_file:
    _METADATA = tomli.load(proj_file)

DESCRIPTION = _METADATA["project"]["description"]
NAME = _METADATA["project"]["name"]
VERSION = _METADATA["project"]["version"]

It worked fine upon testing, but I did not test robustly enough: once I tried to install this in a fresh location / machine, it failed because the pyproject.toml file is not part of the package installation. (I should have realized this.)


So, what is the right / best way to provide metadata, like the package version, to my built package? I have the following requirements:

  1. I only want to provide the information once, in the pyproject.toml. (I know that if I need to repeat a value, at some point there will be a mismatch.)
  2. I want the information to be available to the end user, so that someone who installs the package can do something like mypackage.VERSION from her interactive Python session.
  3. I want to only use pyproject.toml and Poetry / PDM. (I actually use PDM, but I know that Poetry is more popular. The point is that I don't want a setup.py or setup.cfg hack. I want to purely use the new way.)

Solution

  • As you've seen, the pyproject.toml file from the original source tree is generally not available to read from within an installed package. This isn't new: the setup.py file from a legacy source tree would likewise be missing from an actual installation. The package metadata, however, is available in a .dist-info subdirectory alongside your package installation.

    Here is a real example of an installed package and its corresponding metadata files, which contain the version info and other fields that came from the setup.py or pyproject.toml:

    $ pip install -q six
    $ pip show --files six
    Name: six
    Version: 1.16.0
    Summary: Python 2 and 3 compatibility utilities
    Home-page: https://github.com/benjaminp/six
    Author: Benjamin Peterson
    Author-email: benjamin@python.org
    License: MIT
    Location: /tmp/eg/.venv/lib/python3.11/site-packages
    Requires: 
    Required-by: 
    Files:
      __pycache__/six.cpython-311.pyc
      six-1.16.0.dist-info/INSTALLER
      six-1.16.0.dist-info/LICENSE
      six-1.16.0.dist-info/METADATA
      six-1.16.0.dist-info/RECORD
      six-1.16.0.dist-info/REQUESTED
      six-1.16.0.dist-info/WHEEL
      six-1.16.0.dist-info/top_level.txt
      six.py
    $ grep '^Version:' .venv/lib/python3.11/site-packages/six-1.16.0.dist-info/METADATA
    Version: 1.16.0
    

    This .dist-info/METADATA file is where the version info and other metadata are stored. Users of your package, and the package itself, may access package metadata if/when necessary by using the stdlib importlib.metadata module. The version string is guaranteed to be there for an installed package, because it's a required field in the metadata specification, and there is a dedicated function for it, importlib.metadata.version. Other keys, which are optional, can be found using the metadata headers, e.g.:

    >>> from importlib.metadata import metadata
    >>> metadata("six")["Summary"]
    'Python 2 and 3 compatibility utilities'
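
For the version specifically, the dedicated function is importlib.metadata.version. A minimal sketch follows; the helper name installed_version and its default parameter are hypothetical, added only so the lookup degrades gracefully when the named distribution is not installed (for example, when running from an uninstalled source checkout).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name, default=None):
    """Return the installed version string of a distribution.

    `installed_version` and `default` are hypothetical names for this
    sketch; only `version` and `PackageNotFoundError` come from the stdlib.
    """
    try:
        # Reads the Version field from the distribution's METADATA file.
        return version(dist_name)
    except PackageNotFoundError:
        # Raised when no .dist-info (or .egg-info) exists for that name.
        return default
```

In the environment shown above, installed_version("six") would return the same version string that pip show reports.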
    

    All versions of Python where accessing the metadata was non-trivial are now EOL, but if you need to maintain support for Python <= 3.7 then importlib-metadata from PyPI can be used the same way as stdlib.
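
One common way to support both cases is a guarded import. This is only a sketch: the alias importlib_metadata is a naming choice made here, and the fallback branch assumes the importlib-metadata backport from PyPI is installed on the older interpreter.

```python
try:
    # Python 3.8+: the module ships with the standard library.
    import importlib.metadata as importlib_metadata
except ImportError:
    # Python <= 3.7: assumes `pip install importlib-metadata` was done.
    import importlib_metadata

# Either way, the same functions are available under one name,
# e.g. importlib_metadata.version("six").
```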

    When you only publish one top-level directory containing an __init__.py file and it has the same name as the project name in pyproject.toml, then you may use the __package__ attribute so that you don't have to hardcode the package name in source code:

    # works in any module directly inside the top-level package; in a nested
    # subpackage, __package__ is the dotted subpackage name instead
    import importlib.metadata
    
    the_version_str = importlib.metadata.version(__package__)
    

    If the names do not match (example: the distribution package "python-dateutil" provides import package "dateutil"), or you have multiple top-level names (example: the distribution package "setuptools" provides import packages "setuptools" and "pkg_resources"), then you may want to just hardcode the package name, or try to discover it from the installed packages_distributions() mapping (available in importlib.metadata since Python 3.10).

    This satisfies point 1. and point 3.

    For point 2.:

    "I want the information to be available to the end user, so that someone who installs the package can do something like mypackage.VERSION from her interactive Python session."

    A similar recipe works:

    # in your_package/__init__.py
    import importlib.metadata
    
    VERSION = importlib.metadata.version(__package__)
    

    However, I would recommend not providing a version attribute at all. There are several disadvantages to doing so, which I've described here and here. If you have historically provided a version attribute and need to keep it for backwards compatibility, you may consider maintaining support with a module-level fallback __getattr__, which is invoked when an attribute is not found by the usual means:

    # in mypackage/__init__.py
    import importlib.metadata

    def __getattr__(name):
        # called only when normal module attribute lookup fails
        if name == "VERSION":
            # consider adding a deprecation warning here advising caller to use
            # importlib.metadata.version directly
            return importlib.metadata.version("mypackage")
        raise AttributeError(f"module {__name__!r} has no attribute {name!r}")