Tags: python, dependencies, package-managers, requirements.txt

Method to determine the lowest required versions of Python packages for a project/package?


This question concerns any package, not just the Python version itself. To give some context: we are planning to build an internal package at work, which will naturally have many dependencies. To give our developers freedom and avoid messy version conflicts, I want to specify broader constraints in the package requirements (requirements.txt), for example pandas>=1.0 or pyspark>=1.0.0, <2.0.
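
For illustration, the requirements.txt I have in mind would contain broad constraints along these lines (the packages and bounds here are just examples):

    # requirements.txt -- broad constraints rather than exact pins
    pandas>=1.0
    pyspark>=1.0.0, <2.0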

Is there a way to efficiently determine/test the lowest required versions for a given codebase?

I could install pandas==0.2.4 and see if the code runs, and so on, but that approach seems to get out of hand pretty fast. This is the first time I've worked on package building, so I am kinda lost here. Looking at other packages' source code (on GitHub) didn't help, because I have no idea what methodology developers use to specify dependency constraints.


Solution

  • Installing a project using its "lower bounds" dependencies is useful to ensure that (a) you have those lower bounds specified properly and (b) the tests pass.

    This is a popular feature request for pip, and is being tracked in Add a resolver option to use the specified minimum version for a dependency #8085. The issue has been open for years, and there doesn't appear to be sufficient volunteer time to implement it, despite the support and interest evident on the issue tracker. pip's resolver only supports targeting the latest satisfying versions of dependencies.

    Fortunately, alternate resolution strategies have been implemented in a different frontend, uv. uv pip install can be used as a (mostly) drop-in replacement for pip install, and it gives you a new resolver option:

    --resolution resolution The strategy to use when selecting between the different compatible versions for a given package requirement.

    By default, uv will use the latest compatible version of each package (highest).

    Possible values:

    • highest: Resolve the highest compatible version of each package
    • lowest: Resolve the lowest compatible version of each package
    • lowest-direct: Resolve the lowest compatible version of any direct dependencies, and the highest compatible version of any transitive dependencies
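
    Against your own project, a typical invocation might look like this (assuming your package declares a test extra; adapt the extra name to your project):

    $ uv pip install --resolution=lowest-direct -e '.[test]'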

    Let's see what this looks like using a popular Python package with a few dependencies: requests.

    For this example, I'll create parallel venvs based on a Python 3.9 interpreter and manage them externally, installing uv into the second env.

    Note: Usually I'd recommend a global uv installation not tied to any particular env, letting it autodetect the env (see the uv installation docs about that), but for the purposes of this example it's clearer if the uv installation is tied to .venv2.

    $ python3.9 -m venv .venv1
    $ python3.9 -m venv .venv2
    $ .venv2/bin/python3 -m pip install -q uv
    

    First, a regular pip install in the fresh .venv1:

    $ .venv1/bin/python -m pip install -q requests
    $ .venv1/bin/python -m pip freeze
    certifi==2024.7.4
    charset-normalizer==3.3.2
    idna==3.8
    requests==2.32.3
    urllib3==2.2.2
    

    It selected the latest versions of all dependencies. By default, uv selects those same versions:

    $ .venv2/bin/python -m uv pip install requests
    Resolved 5 packages in 1ms
    Installed 5 packages in 1ms
     + certifi==2024.7.4
     + charset-normalizer==3.3.2
     + idna==3.8
     + requests==2.32.3
     + urllib3==2.2.2
    

    However, with the lowest resolution strategy we get, among other things, the older urllib3 1.x:

    $ .venv2/bin/python -m uv pip install --resolution=lowest --force-reinstall requests==2.32.3
    Resolved 5 packages in 21ms
    Prepared 5 packages in 0.69ms
    Uninstalled 5 packages in 1ms
    Installed 5 packages in 2ms
     - certifi==2024.7.4
     + certifi==2017.4.17
     - charset-normalizer==3.3.2
     + charset-normalizer==2.0.0
     - idna==3.8
     + idna==2.5
     ~ requests==2.32.3
     - urllib3==2.2.2
     + urllib3==1.21.1
    

    This matches the lower bounds advertised in the requests v2.32.3 metadata.
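
    You can cross-check this against the metadata requests actually declares, e.g. with the standard library (output abbreviated; the exact formatting may vary between versions):

    $ .venv2/bin/python -c 'from importlib.metadata import requires; print(*requires("requests"), sep="\n")'
    charset_normalizer<4,>=2
    idna<4,>=2.5
    urllib3<3,>=1.21.1
    certifi>=2017.4.17
    ...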

    Does it still work? Yes!

    $ .venv2/bin/python -c 'import requests; print(requests.get("https://example.org"))'
    <Response [200]>
    

    Does it still work if we install even lower deps? Probably not!

    $ .venv2/bin/python -m uv pip install 'urllib3<1.21'
    Resolved 1 package in 41ms
    Prepared 1 package in 31ms
    Uninstalled 1 package in 0.59ms
    Installed 1 package in 0.62ms
     - urllib3==1.21
     + urllib3==1.20
    $ .venv2/bin/python -c 'import requests; print(requests.get("https://example.org"))'
    .venv2/lib/python3.9/site-packages/requests/__init__.py:113: RequestsDependencyWarning: urllib3 (1.20) or chardet (None)/charset_normalizer (2.0.0) doesn't match a supported version!
      warnings.warn(
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File ".venv2/lib/python3.9/site-packages/requests/api.py", line 73, in get
        return request("get", url, params=params, **kwargs)
      File ".venv2/lib/python3.9/site-packages/requests/api.py", line 59, in request
        return session.request(method=method, url=url, **kwargs)
      File ".venv2/lib/python3.9/site-packages/requests/sessions.py", line 589, in request
        resp = self.send(prep, **send_kwargs)
      File ".venv2/lib/python3.9/site-packages/requests/sessions.py", line 703, in send
        r = adapter.send(request, **kwargs)
      File ".venv2/lib/python3.9/site-packages/requests/adapters.py", line 633, in send
        conn = self.get_connection_with_tls_context(
      File ".venv2/lib/python3.9/site-packages/requests/adapters.py", line 489, in get_connection_with_tls_context
        conn = self.poolmanager.connection_from_host(
    TypeError: connection_from_host() got an unexpected keyword argument 'pool_kwargs'
    

    This technique of using --resolution=lowest-direct and/or --resolution=lowest, in combination with a good test suite, can be used to determine accurate lower bounds for your project's metadata, either for direct dependencies only or recursively.
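
    A minimal sketch of that workflow, following the same pattern as above and assuming a project with a pyproject.toml that declares a test extra and uses pytest (both assumptions about your layout):

    $ python3.9 -m venv .venv-lowest
    $ .venv-lowest/bin/python -m pip install -q uv
    $ .venv-lowest/bin/python -m uv pip install --resolution=lowest-direct -e '.[test]'
    $ .venv-lowest/bin/python -m pytest

    If the tests fail here, either raise the offending lower bound in your metadata or restore compatibility in the code, then repeat.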

    Additionally, you can use the same feature to verify the adequacy of the specified lower bounds in your project's CI. This latter point is very important: if you only test your changes using a plain pip install, you'll get the latest available versions of packages by default and will have no visibility into whether a change breaks compatibility with older versions of dependencies. By also testing in CI with a lowest resolution strategy (see the sketch below), you gain visibility into code changes that may necessitate a stricter lower bound in the package metadata. You can then decide whether to increase the lower bound, or rework the changes in a way that preserves compatibility with older dependencies.
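
    As a sketch, with uv available in the job's environment, a CI job could simply run the suite twice, once per resolution strategy (again assuming a test extra):

    $ python -m uv pip install -e '.[test]' && python -m pytest
    $ python -m uv pip install --resolution=lowest-direct --force-reinstall -e '.[test]' && python -m pytest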


    Another useful tool tangentially related to this question is pypi-timemachine. It filters the package index so that only releases older than a given timestamp are available. This lets you add time-based version bounds to an installation, simulating installation into a user environment that was created at some point in the past.
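
    Usage is roughly as follows: pypi-timemachine starts a local proxy index and prints its URL, which you then pass to the installer (the port is chosen at runtime, so substitute whatever it prints):

    $ pip install pypi-timemachine
    $ pypi-timemachine 2022-01-01
    ...
    $ pip install --index-url http://localhost:<port>/ requests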