python setuptools packaging python-importlib

importlib.metadata doesn't appear to handle the authors field from a pyproject.toml file correctly


I'm having a problem retrieving the author metadata information of a package through importlib.metadata.

My package setup is as follows:

I'm using Python's setuptools with a minimal setup.py and some project metadata in pyproject.toml.

setup.py:

from setuptools import setup
setup()

pyproject.toml:

[project]
name = "myproj"
version = "0.1.0"
description = "Some description"
authors = [
    {name = "Some One", email = "[email protected]"},
    {name = "Another Person", email = "[email protected]"}
]
readme = "README.md"

requires-python = ">=3.11"
license = {text = "BSD-3-Clause"}

[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

(I think setup.py is not even necessary for this, but I include it for completeness.)

The pyproject.toml uses the keys described in the Python Packaging User Guide section "Declaring project metadata".

The project itself is very simple: its structure is

myproj/
       __init__.py

and in myproj/__init__.py, I'm using importlib.metadata to get some metadata about the package:

from importlib.metadata import metadata

meta = metadata(__package__ or __name__)
print(meta['Name'])
print(meta['Version'])
print(meta['Author-email'])
print(meta['Author'])

I install the package in a virtualenv using pip.

Outside of the project directory, but with the virtualenv activated, I run Python and import the package:

python -c "import myproj"

This prints the various metadata fields:

myproj
0.1.0
Some One <[email protected]>, Another Person <[email protected]>
None

This is where the problem is: the "Author-email" field yields all the entries of the authors key in [project], joined into a single string, while the "Author" field yields None.

Is there a way to get the individual authors (without their email) out of the metadata from within the package?

Or is importlib.metadata not up to date with the packaging metadata? (I did a quick search on bugs.python.org, but couldn't find anything related.)

(Of course, I can parse the "Author-email" field manually and separate it into appropriate parts, but it would be nice if that can be avoided.)


Solution

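  • Regarding the "author" and "author email" fields in Python packaging metadata:


    This looks like the specified behaviour rather than a bug in importlib.metadata. Under the pyproject.toml metadata specification (PEP 621), an authors entry that has both a name and an email is written to Author-email in the form "name <email>", with multiple entries separated by commas. Author is only filled from entries that give a name without an email, which is why it is None here.

    If I were you I would look into Python's own email library. There is a good chance it has the necessary features to parse the authors metadata properly.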

    It seems like something like the following could help (modified from this answer):

    import email
    import email.policy
    
    addresses = 'Some One <[email protected]>, Another Person <[email protected]>'
    
    em = email.message_from_string(
        f'To: {addresses}',
        policy=email.policy.default,
    )
    
    for address in em['to'].addresses:
        print(f'{address.display_name} <{address.addr_spec}>')
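
Since the question asks for the author names without their emails, a slightly shorter route might be email.utils.getaddresses from the standard library, which splits an address list into (name, address) pairs. The following is only a sketch: the helper names author_names and package_author_names are made up for illustration, the example.com addresses are placeholders, and the fallback assumes that name-only authors in the "Author" field are comma-separated (so a name that itself contains a comma would be split incorrectly).

```python
from email.utils import getaddresses
from importlib.metadata import PackageNotFoundError, metadata


def author_names(author_email_field):
    """Return the display names found in an 'Author-email' style string.

    Hypothetical helper: accepts None or an empty string and returns [].
    """
    if not author_email_field:
        return []
    # getaddresses() takes a list of header values and returns
    # (display_name, address) pairs.
    pairs = getaddresses([author_email_field])
    # Skip entries that carry only an address and no display name.
    return [name for name, _address in pairs if name]


def package_author_names(dist_name):
    """Collect author names for an installed distribution (hypothetical helper)."""
    try:
        meta = metadata(dist_name)
    except PackageNotFoundError:
        return []
    names = author_names(meta.get("Author-email"))
    # Authors declared with a name but no email go to "Author" instead.
    # Assumption: multiple names there are separated by commas.
    author = meta.get("Author")
    if author:
        names.extend(part.strip() for part in author.split(",") if part.strip())
    return names


if __name__ == "__main__":
    # Placeholder addresses, not taken from the question.
    example = "Some One <some.one@example.com>, Another Person <another@example.com>"
    print(author_names(example))
```

Inside myproj/__init__.py this could be called as package_author_names(__package__ or __name__), mirroring the metadata() call in the question.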