
Pylint recursive bug when directory is prefix of another directory


I am seeing odd pylint behavior when one directory's name is a prefix of another directory's name. Here is a minimal setup to reproduce it:

mkdir pylint_test
cd pylint_test
mkdir dataset
touch dataset/__init__.py
mkdir dataset_123
echo "from collections import Counter" > dataset_123/a.py
printf '[master]\nenable=all\n' > pylintrc
pylint --recursive=y .

I would expect an unused-import warning for a.py, but none is reported. Stranger still, if I remove (or rename) the file __init__.py in dataset, I get the expected output:

pylint --recursive=y .
************* Module a
eval_123/a.py:1:0: C0114: Missing module docstring (missing-module-docstring)
eval_123/a.py:1:0: W0611: Unused Counter imported from collections (unused-import)

Stranger yet, if I keep __init__.py but rename the two directories to eval and eval_123 respectively, everything works again.

Another experiment is renaming dataset_123 to dataset_b123. In that case pylint reports the issues as expected (even when __init__.py is present).

I am on macOS Ventura 13.4.1, and here are my package versions:

pylint --version
pylint 2.15.2
astroid 2.13.5
Python 3.8.13 (default, Oct 19 2022, 17:52:09)
[Clang 12.0.0 ]

However, this also reproduces with other Python versions.

Does anyone have a clue what is going on here?


Solution

  • This appears to be a bug, and I would recommend raising an issue on the pylint GitHub repository.

    I was able to narrow the bug down to this line in the _discover_files method. This method is responsible for discovering Python modules and packages recursively (it is what runs when you pass --recursive=y).

    if any(root.startswith(s) for s in skip_subtrees):
        # Skip subtree of already discovered package.
        continue
    

    I believe this check was intended to exclude a subtree (for example dataset/foo) once the search has determined that we are already inside a package (for example, dataset contains __init__.py). However, because it is a plain string-prefix comparison, it also inadvertently excludes sibling directories whose names start with the same string, such as dataset_123.
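
To make the false positive concrete, here is a minimal sketch that isolates the prefix check quoted above and compares it with one possible path-aware alternative. The function names naive_skip and path_aware_skip are hypothetical and are not part of pylint; the alternative only illustrates the idea and is not necessarily how pylint would (or did) fix it.

```python
import os


def naive_skip(root, skip_subtrees):
    """Mirrors the string-prefix check quoted above."""
    return any(root.startswith(s) for s in skip_subtrees)


def path_aware_skip(root, skip_subtrees):
    """Hypothetical alternative: skip only the package directory itself
    or directories that are really nested inside it."""
    return any(root == s or root.startswith(s + os.sep) for s in skip_subtrees)


# Assume dataset/ was already discovered as a package (it has __init__.py).
skip_subtrees = [os.path.join(".", "dataset")]

sibling = os.path.join(".", "dataset_123")   # unrelated sibling directory
child = os.path.join(".", "dataset", "foo")  # genuine subdirectory of the package

print(naive_skip(sibling, skip_subtrees))       # True: sibling is wrongly skipped
print(path_aware_skip(sibling, skip_subtrees))  # False: sibling is kept
print(path_aware_skip(child, skip_subtrees))    # True: real subtree is still skipped
```

Appending the path separator before comparing means "dataset" can only match "dataset" itself or paths under "dataset/", so a sibling such as dataset_123 is no longer treated as part of the package.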