
Python class membership and absolute paths


I'm working on a project where the source code is nested in the following structure:

|-- my-project
    |-- test_notebook.ipynb
    |-- package
        |-- src
        |    |-- module1.py    #defines MyClass
        |    |-- module2.py    #defines MyOtherClass 
        |    |-- some_other_modules.py
        |    |-- __init__.py
        |-- tests
             |-- some_modules.py

The problem is as follows. MyClass in module1.py has a method, my_method, which looks like this:

from src.module2 import MyOtherClass

class MyClass:

    def my_method(self, list_of_objects: list):

        if all(isinstance(obj, str) for obj in list_of_objects):
            pass  # do something

        elif all(isinstance(obj, MyOtherClass) for obj in list_of_objects):
            pass  # do something else

        else:
            raise TypeError("This is a test")

If I create a list of MyOtherClass instances inside module1 itself and test my_method, it works as expected: the elif clause is satisfied. However, if I create the same list of instances in test_notebook.ipynb, the class of each instance is <class package.src.module2.MyOtherClass>. When I then call my_method, a TypeError is raised: Python (I'm using version 3.10) considers <class package.src.module2.MyOtherClass> to be different from <class MyOtherClass>.

from package.src.module1 import MyClass
from package.src.module2 import MyOtherClass

obj1 = MyClass()

other_class_obj_1 = MyOtherClass()
other_class_obj_2 = MyOtherClass()
other_class_obj_3 = MyOtherClass()

result = obj1.my_method([other_class_obj_1, other_class_obj_2, other_class_obj_3])

The TypeError is raised:

TypeError: This is a test

I tried rewriting the condition using the absolute path of MyOtherClass, as follows:

elif all(isinstance(obj, package.src.module2.MyOtherClass) for obj in list_of_objects):
    #do_something

But that did not work, since Python is not able to resolve package.src.module2.MyOtherClass. Any idea how to solve this?

I'm adding the value of sys.path below:

>>> import sys
>>> print(sys.path)
['', '/Users/my_name/Documents/Python_Projects/my-project/package', '/Library/Frameworks/Python.framework/Versions/3.10/lib/python310.zip', '/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10', '/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/lib-dynload', '/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages']

Solution

  • Now that you've posted your sys.path and your imports, the problem is clear. Your sys.path, after the '' (which has special meaning that's not relevant here), begins with:

    '/Users/my_name/Documents/Python_Projects/my-project/package'
    

    and does not at any point contain:

    '/Users/my_name/Documents/Python_Projects/my-project'
    

    This means that when you import with import spam or from spam import eggs, and the search reaches that component of sys.path without having found spam earlier, Python looks for a file named /Users/my_name/Documents/Python_Projects/my-project/package/spam.py (a module) or a directory named /Users/my_name/Documents/Python_Projects/my-project/package/spam/ (a package). Reaching that component is likely unless spam is found in the working directory first, which can confuse things: you might see your code work fine if you ran Python interactively from /Users/my_name/Documents/Python_Projects/my-project, or if you ran a script located in that directory.
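The lookup described above can be sketched in a few lines. This is a deliberately simplified model, not the real import machinery: it ignores extension modules, zip imports, and import hooks, and the helper name resolve_top_level is hypothetical. It builds a throwaway copy of the question's layout in a temporary directory and asks where a top-level name would be found for a given search path.

```python
import tempfile
from pathlib import Path


def resolve_top_level(name, search_path):
    """Simplified model of how a top-level `import name` is located.

    For each entry, a regular package (directory with __init__.py) or a
    module file wins immediately; a bare directory is only remembered as
    a fallback (roughly how namespace packages behave).
    """
    fallback_dir = None
    for entry in search_path:
        base = Path(entry or ".")  # '' stands for the current directory
        if (base / name / "__init__.py").is_file():
            return base / name
        if (base / f"{name}.py").is_file():
            return base / f"{name}.py"
        if fallback_dir is None and (base / name).is_dir():
            fallback_dir = base / name
    return fallback_dir


# Throwaway stand-in for my-project (layout copied from the question).
root = Path(tempfile.mkdtemp())
(root / "package" / "src").mkdir(parents=True)
(root / "package" / "src" / "__init__.py").write_text("")

# With only .../my-project/package on the path, "package" is not found ...
print(resolve_top_level("package", [root / "package"]))  # None
# ... but "src" is, which is why `from src.module2 import ...` works.
print(resolve_top_level("src", [root / "package"]))
# With .../my-project on the path, "package" resolves.
print(resolve_top_level("package", [root]))
```

Running a check like this against the real sys.path is one way to see which entry is answering for a given top-level name.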

    Your imports are asking for from package.src.module1 import MyClass, so the only way they'll be found in /Users/my_name/Documents/Python_Projects/my-project/package is if the rest of your files are in a subdirectory of package also named package.
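One consequence worth seeing concretely, assuming the notebook is started from my-project so that both my-project and my-project/package end up importable: the same file can then be loaded under two different module names, src.module2 and package.src.module2, and each load creates its own MyOtherClass. The sketch below reproduces that in a temporary directory. The package/__init__.py file is an assumption added to keep the demo deterministic; it is not shown in the question's tree.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway stand-in for my-project.
root = Path(tempfile.mkdtemp())
src = root / "package" / "src"
src.mkdir(parents=True)
(root / "package" / "__init__.py").write_text("")  # assumption, see lead-in
(src / "__init__.py").write_text("")
(src / "module2.py").write_text("class MyOtherClass:\n    pass\n")

# Mimic the situation in the question: the project root (the role '' plays
# when Python runs from my-project) plus the package directory itself.
sys.path[:0] = [str(root), str(root / "package")]
importlib.invalidate_caches()

via_package = importlib.import_module("package.src.module2")  # notebook's import
via_src = importlib.import_module("src.module2")              # module1's import

same_file = Path(via_package.__file__) == Path(via_src.__file__)
same_class = via_package.MyOtherClass is via_src.MyOtherClass
passes_check = isinstance(via_package.MyOtherClass(), via_src.MyOtherClass)

print(same_file, same_class, passes_check)  # True False False
```

The two classes come from one source file but are distinct objects, so an isinstance check written against one rejects instances of the other. Importing the module through a single, consistent name everywhere avoids this.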

    Two fixes are possible:

    1. Change your imports to remove the package. prefix, e.g.:

      from package.src.module1 import MyClass
      from package.src.module2 import MyOtherClass
      

      becomes:

      from src.module1 import MyClass
      from src.module2 import MyOtherClass
      

      Probably not what you want, since package is a logical part of the module name, so we go to the other option.

    2. Change whatever you're doing to modify sys.path (whether it's manual manipulation within the Python script, the hack of setting the PYTHONPATH environment variable to point to that directory, hacking your sys.path via a custom .pth file in your site-packages directory, or some complicated IDE configuration that accomplishes a similar end result) so that instead of adding '/Users/my_name/Documents/Python_Projects/my-project/package' it adds '/Users/my_name/Documents/Python_Projects/my-project' (without the final /package). This will make the existing from package.src.module1 import XXX imports work, because now it will be scanning the my-project directory, not the package directory, looking for a directory named package (which my-project contains, and package does not).

      Psychic debugging says you're probably using PYTHONPATH hacks (if you were manually tweaking sys.path, it's unlikely you'd remember to put the entry after the leading '' entry, and all the other options are not as discoverable if you don't already know the import system by heart), e.g. you've got:

      export PYTHONPATH="$HOME/Documents/Python_Projects/my-project/package"
      

      in one of your shell config files (~/.bashrc, ~/.bash_profile, etc.), or you ran it manually in the shell you're running your code from, or you're possibly doing something similar in a Jupyter notebook configuration. So find wherever you're doing that, and trim off the /package component.
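To check whether PYTHONPATH is the culprit, a small diagnostic along these lines may help. It is only a sketch: the helper names pythonpath_entries and suspicious_entries are hypothetical, and it assumes the top-level package directory is literally named package, as in the question.

```python
import os


def pythonpath_entries(environ=os.environ):
    """Return the directories listed in PYTHONPATH, in order."""
    raw = environ.get("PYTHONPATH", "")
    return [entry for entry in raw.split(os.pathsep) if entry]


def suspicious_entries(entries, top_level="package"):
    """Entries that point at the package directory itself rather than its parent."""
    return [
        entry
        for entry in entries
        if os.path.basename(os.path.normpath(entry)) == top_level
    ]


# Inspect the environment of the current process (e.g. run this in the notebook).
print(suspicious_entries(pythonpath_entries()))
```

If the list is non-empty, those are the entries to trim. An empty list suggests the path entry comes from somewhere else, such as a .pth file or IDE configuration.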

    For the future, you probably want to write a proper setup.py and/or setup.cfg for your project so you can build a distribution package for it (a source tarball, a wheel, or the like) and install it properly with python3 -m pip, so that you don't rely on hacks like PYTHONPATH at all. PYTHONPATH is not as awful as it used to be, now that few people use Python 2, but it is still completely unversioned and can be set in a million ways. If you inadvertently have code there that's only syntactically valid in some versions of Python (e.g. any code with a variable, function, or anything else named async stopped being legal Python in 3.7), you have to rewrite it to be perfectly portable. Properly installed modules, by contrast, can be installed separately for each version of Python, installed globally or per-user, or even isolated within virtual environments created by the venv or virtualenv packages.