pandas, pip, python-packaging

Dependency conflict between pandas and openpyxl did not show up while trying to install packages


I am working on a Python service (running on Python 3.10) where I install all the dependencies using pip install -r requirements.txt. Normally, pip throws an error like the one below if there's a dependency conflict between the packages.

For example, if I use the packages python-dateutil (2.8.1), botocore (1.16.2), and pandas (2.0.0), it throws the error below:

ERROR: Cannot install -r requirements.txt (line 3), pandas==2.0.0 and python-dateutil==2.8.1 because these package versions have conflicting dependencies.
The conflict is caused by:
The user requested python-dateutil==2.8.1
botocore 1.16.2 depends on python-dateutil<3.0.0 and >=2.1
pandas 2.0.0 depends on python-dateutil>=2.8.2
To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

However, it does not throw any error when one package does not support the installed version of another. For example, if I use openpyxl (3.0.6) with pandas (2.0.0), the part of the code where openpyxl is used throws the following error:

Pandas require version '3.0.7' or newer of 'openpyxl' (version '3.0.6' is currently installed).

But we only get this error when that part of the code runs.

Why is this error not shown while installing the package? Is there any other command that can be used to identify all such issues?


Solution

  • Why is this error not shown while installing the package?

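    Because your requirements.txt file only requests pandas==2.0.0, without any extras, and openpyxl is an optional dependency of pandas, declared in the excel extra. pip only checks optional dependencies for the extras you explicitly request.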

    Check the pandas/pyproject.toml:

    excel = ['odfpy>=1.4.1', 'openpyxl>=3.0.7', 'pyxlsb>=1.0.8', 'xlrd>=2.0.1', 'xlsxwriter>=1.4.3']
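The same information is available from the installed package metadata, so you do not have to open pyproject.toml. Below is a minimal sketch using the standard-library importlib.metadata module; the helper name requirements_for_extra is made up for this example, and it assumes each requirement string carries its extra in a marker of the form extra == "excel", as in the pip error message quoted in this answer.

```python
import re
from importlib import metadata


def requirements_for_extra(requires, extra):
    """Return the requirement strings that apply only when `extra` is requested.

    Hypothetical helper: `requires` is a list of strings such as the one
    returned by importlib.metadata.requires(), or None.
    """
    # Matches markers such as: extra == "excel" or extra == 'excel'
    pattern = re.compile(r"""extra\s*==\s*['"]""" + re.escape(extra) + r"""['"]""")
    selected = []
    for line in requires or []:
        # Everything after the first ';' is the environment marker.
        requirement, _, marker = line.partition(";")
        if pattern.search(marker):
            selected.append(requirement.strip())
    return selected


if __name__ == "__main__":
    try:
        declared = metadata.requires("pandas")
    except metadata.PackageNotFoundError:
        declared = None  # pandas is not installed in this environment
    for req in requirements_for_extra(declared, "excel"):
        print(req)
```

Run in the environment where pandas is installed, this should list the optional requirements of the excel extra, including the minimum openpyxl version.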
    

    If you modify your requirements.txt like this:

    openpyxl==3.0.6
    pandas[excel]==2.0.0
    

    The command pip install -r requirements.txt will fail:

    ERROR: Cannot install openpyxl==3.0.6 and pandas[excel]==2.0.0 because these package versions have conflicting dependencies.
    
    The conflict is caused by:
        The user requested openpyxl==3.0.6
        pandas[excel] 2.0.0 depends on openpyxl>=3.0.7; extra == "excel"
    
    To fix this you could try to:
    1. loosen the range of package versions you've specified
    2. remove package versions to allow pip attempt to solve the dependency conflict
    

    Solution: remove the openpyxl pin from requirements.txt (keeping pandas[excel]==2.0.0) and let pip pick a compatible version.