I have a Python module directory structure like this:
my_module
|--__init__.py
|--public_interface
| |--__init__.py
| |--my_sub_module
| | |--__init__.py
| | |--code.py
| |--some_more_code.py
|--other directories omitted
Now, the public_interface
directory (among several others) is only there to organize the code into logical sub-units, as a guideline for me and other developers. The eventual user of my_module
shall only see it as my_module.my_sub_module
without the public_interface
in-between.
I wrote these __init__.py
files:
my_module.__init__.py
:from .public_interface import *
and
my_module.public_interface.__init__.py
:from . import my_sub_module
from .some_more_code import *
and
my_module.public_interface.my_sub_module.__init__.py
:from .code import *
This works fine as long as the user imports only the top-level module:
import my_module
my_module.my_sub_module.whatever # Works as intended
However, this does not work:
from my_module import my_sub_module
nor:
import my_module.my_sub_module
What would I have to change to make these last two imports work?
The import system only allows actual packages and modules to be imported directly as part of the dotted module name, but your:
from .public_interface import *
hack just makes my_sub_module
an attribute of the my_module
package, not an actual submodule for the purposes of the import system. It breaks for the same reason doing:
from collections._sys import *
breaks; yes, as an implementation detail, the collections
package happens to import sys
aliased to _sys
, but that doesn't actually make _sys
a subpackage of collections
, it's just one of many attributes on the collections
package. From the import machinery's point of view, my_sub_module
is no more a submodule of my_module
than _sys
is of collections
; the fact that it is nested in a sub-directory under my_module
is irrelevant.
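To see the attribute-versus-submodule distinction for yourself, here is a minimal sketch. The helper name is_importable is made up for this illustration, and the presence of collections._sys is a CPython implementation detail that may not hold elsewhere:

```python
import collections
import importlib


def is_importable(dotted_name):
    """Return True if the import system can resolve dotted_name as a module."""
    try:
        importlib.import_module(dotted_name)
    except ModuleNotFoundError:
        return False
    return True


# On CPython, _sys is reachable as a plain attribute of the package...
print(hasattr(collections, "_sys"))
# ...and collections.abc is a real submodule backed by a file in the package...
print(is_importable("collections.abc"))
# ...but collections._sys is not something the import machinery can find.
print(is_importable("collections._sys"))
```

The last call fails because the import system looks in sys.modules and then in the directories listed in the package's __path__, and never consults ordinary attributes of the package object.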
That said, the import system provides a hook to allow you to treat additional arbitrary directories as being part of a package, the __path__
attribute. By default, __path__
just includes the path to the package itself (so my_module
's __path__
defaults to ['/absolute/path/to/my_module']
), but you can programmatically manipulate it however you want; when resolving submodules, it will search only through the final contents of __path__
, much like importing top level modules searches sys.path
. So to resolve your particular case (wanting all packages/modules in public_interface
to be importable without specifying public_interface
in the import line), just change your my_module/__init__.py
file to have the following contents:
import os.path
__path__.append(os.path.join(os.path.dirname(__file__), 'public_interface'))
All that does is tell the import system that, when import my_module.XXXX
occurs (XXXX
is a placeholder for a real name), if it can't find my_module/XXXX
or my_module/XXXX.py
, it should look for my_module/public_interface/XXXX
or my_module/public_interface/XXXX.py
. If you want it to search public_interface
first, change it to:
__path__.insert(0, os.path.join(os.path.dirname(__file__), 'public_interface'))
or to have it only check public_interface
(so nothing directly under my_module
is importable at all), use:
__path__[:] = [os.path.join(os.path.dirname(__file__), 'public_interface')]
to replace the contents of __path__
entirely.
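If you want to check the __path__ approach end to end without touching your real package, the following sketch recreates a cut-down version of the layout from the question in a temporary directory and runs both failing import forms. The helper names build_demo_package and run_demo, and the body of whatever, are invented for this demo; only the __path__.append line comes from the answer above:

```python
import importlib
import os
import sys
import tempfile
import textwrap


def build_demo_package(root):
    """Write a cut-down copy of the question's layout under root."""
    pkg = os.path.join(root, "my_module")
    sub = os.path.join(pkg, "public_interface", "my_sub_module")
    os.makedirs(sub)
    files = {
        # The only non-trivial file: extend __path__ as described above.
        os.path.join(pkg, "__init__.py"): """
            import os.path
            __path__.append(os.path.join(os.path.dirname(__file__), 'public_interface'))
        """,
        os.path.join(pkg, "public_interface", "__init__.py"): "",
        os.path.join(sub, "__init__.py"): "from .code import *\n",
        os.path.join(sub, "code.py"): "def whatever():\n    return 'hello'\n",
    }
    for path, body in files.items():
        with open(path, "w") as fh:
            fh.write(textwrap.dedent(body))


def run_demo():
    """Import the demo package both ways and report what was found."""
    with tempfile.TemporaryDirectory() as root:
        build_demo_package(root)
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            import my_module.my_sub_module           # dotted import form
            from my_module import my_sub_module      # from-import form
            return my_sub_module.whatever(), my_module.my_sub_module.__name__
        finally:
            # Undo the side effects so the demo can be run repeatedly.
            sys.path.remove(root)
            for name in [n for n in sys.modules
                         if n == "my_module" or n.startswith("my_module.")]:
                del sys.modules[name]


print(run_demo())
```

One thing to keep in mind: with this setup the same directory is also still importable as my_module.public_interface.my_sub_module, and importing it under both names gives you two distinct module objects, so it is best to stick to one spelling.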
Side-note: You might wonder why os.path
is an exception to this rule; on CPython, os
is a plain module with an attribute path
(which happens to be the module posixpath
or ntpath
depending on platform), yet you can do import os.path
. This works because the os
module, while being imported, explicitly (and hackily) populates the sys.modules
cache for os.path
. This isn't normal, and it has a performance cost; import os
must always import os.path
implicitly, even if nothing from os.path
is ever used. __path__
avoids that problem; nothing is imported unless requested.
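You can observe the os.path special case directly. This is a small check on CPython, not a statement about how every Python implementation lays out os:

```python
import os
import sys

# `import os` alone has already registered "os.path" in the module cache.
print("os.path" in sys.modules)

# The cached entry is the very same object as the os.path attribute.
print(sys.modules["os.path"] is os.path)

# Its real name is the platform-specific module (posixpath or ntpath).
print(os.path.__name__)

# os is a plain module, not a package, so it has no __path__ to search.
print(hasattr(os, "__path__"))
```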
You could achieve the same result by making my_module/__init__.py
contain:
import sys
from .public_interface import my_sub_module
sys.modules['my_module.my_sub_module'] = my_sub_module
which would allow people to use my_module.my_sub_module
having only done import my_module
, but that would force any import
of my_module
to import public_interface
and my_sub_module
, even if nothing from my_sub_module
is ever used. os.path
continues to do it for historical reasons (using os.path
APIs with only import os
worked a long time ago, and a lot of code relies on that misbehavior because programmers are lazy and it worked), but new code shouldn't use this hack.
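For completeness, here is a stripped-down sketch of why the sys.modules trick works at all, using throwaway in-memory modules instead of files. The names demo_parent and demo_parent.child are hypothetical and exist only for this illustration:

```python
import sys
import types

# Build two empty module objects by hand; neither is backed by a file,
# and demo_parent has no __path__, so it is not a real package.
parent = types.ModuleType("demo_parent")
child = types.ModuleType("demo_parent.child")
child.value = 42
parent.child = child

# Registering both names in the cache is what makes the dotted import work:
# the import system checks sys.modules before it searches any path.
sys.modules["demo_parent"] = parent
sys.modules["demo_parent.child"] = child

import demo_parent.child

print(demo_parent.child.value)
```

Note that both module objects had to exist before anyone asked for them, which is exactly the eager-import cost described above.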