
How can I break up this py file in a DRY fashion?


In the context of the business logic of a Flask app, I'm writing a ton of these "definition" instances of a class, putting them in a list, and importing the list where needed. Outside of building it, the list is treated as static.

Simplified example:

definitions.py:

from my_object import MyObject

definition_registry = list()

# team 1, widget 1 definition
_definition = MyObject()
_definition.name = "team 1 widget 1"
_definition.coercer = str
definition_registry.append(_definition)

# team 1, widget 2 definition
_definition = MyObject()
_definition.name = "team 1 widget 2"
_definition.coercer = int
definition_registry.append(_definition)

# team 2, widget 1 definition
_definition = MyObject()
_definition.name = "team 2 widget 1"
_definition.coercer = float
definition_registry.append(_definition)

my_object.py:

class MyObject:
    def __init__(self):
        self.name = "unnamed"
        self.coercer = int

    def __repr__(self):
        return f"MyObject instance: {self.name} / {self.coercer}"

main.py:

from definitions import definition_registry

if __name__ == '__main__':
    print(definition_registry)

Output:

[MyObject instance: team 1 widget 1 / <class 'str'>, MyObject instance: team 1 widget 2 / <class 'int'>, MyObject instance: team 2 widget 1 / <class 'float'>]

How can I break up definitions.py into multiple files (team_1.py, team_2.py, ...)?

Important caveat: The instances of the real MyObject have to be defined in python. In my example the coercer attribute is meant as a placeholder to reinforce that fact.

I thought about using exec, but that's generally bad practice, and this doesn't feel like a good exception to that rule. For example, putting lines 5 to 9 of definitions.py into team1w1.py and replacing them with exec(open("team1w1.py").read()) works, but PyCharm's debugger doesn't step through team1w1.py line by line.

Another way would be to do something like

from team1w1 import definition
definition_registry.append(definition)

from team1w2 import definition
definition_registry.append(definition)
...

This is better but it still smells because

  • from ... import definition is repeated over and over in the same file
  • import MyObject has to be repeated for every definition file

Solution

  • There are several ways to do this; searching for plugin-loading code will turn up more. Here is one way to do it:

    Structure your code like so:

    /myproject
        main.py
        my_object.py
        definitions/
            __init__.py
            team_1.py
            team_2.py
    

    main.py

    This is basically the same as your code with some extra code to show what is happening.

    import sys
    
    before = set(sys.modules.keys())
    
    import definitions
    
    after = set(sys.modules.keys())
    
    if __name__ == '__main__':
        print('\nRegistry:\n')
        for item in definitions.registry:
            print(f"    {item}")
        print()
    
        # this is just to show how to access things in team_1
        print(definitions.team_1.foo)
        print()
    
        # this shows that the modules 'definitions', 'definitions.team_1',
        # and 'definitions.team_2' have been imported (plus others)
        print(after - before)
    

    my_object.py

    As others pointed out, MyObject could take the name and coercer as arguments to __init__(), and the registry could be a class variable with registration handled by __init__().

    class MyObject:
        registry = []
        
        def __init__(self, name="unnamed", coercer=str):
            self.name = name
            self.coercer = coercer
            
            MyObject.registry.append(self)
            
        def __repr__(self):
            return f"MyObject instance: {self.name} / {self.coercer}"
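
A quick way to see the self-registering behaviour is to create a couple of instances and inspect the class attribute. This is only a sketch that repeats the class so it runs on its own; the variable names widget and names are made up for the demonstration.

```python
class MyObject:
    registry = []  # one list, shared by every instance

    def __init__(self, name="unnamed", coercer=str):
        self.name = name
        self.coercer = coercer
        MyObject.registry.append(self)

    def __repr__(self):
        return f"MyObject instance: {self.name} / {self.coercer}"


# Keeping a reference is optional; the registry holds one either way.
widget = MyObject("team 1 widget 1", str)
MyObject("team 1 widget 2", int)

names = [definition.name for definition in MyObject.registry]
print(names)
```

Because registry is a mutable class attribute, subclasses share the same list unless they define their own.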
    

    definitions/__init__.py

    This is the core of the technique. Importing a package runs its __init__.py, so the import definitions statement in main.py triggers the code below. It uses pathlib.Path.glob() to find every file named team_*.py and imports each one with importlib.import_module():

    import importlib
    import my_object
    import pathlib
    
    # this is an alias to the class variable so it can be referenced
    # like definitions.registry
    registry = my_object.MyObject.registry
    
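    package_name = __package__
    # locate the package directory from this file, so discovery does not
    # depend on the current working directory
    package_path = pathlib.Path(__file__).parent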
    
    print(f"importing {package_name} from {__file__}")
    
    for file_path in sorted(package_path.glob('team_*.py')):
        module_name = file_path.stem
        print(f"    importing {module_name} from {file_path}")
        importlib.import_module(f"{package_name}.{module_name}")
    
    print("    done")
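
The discovery loop can also be tried in isolation. The sketch below is not the code above verbatim: it writes a throwaway package to a temporary directory and, as a swapped-in variant, uses pkgutil.iter_modules() on the package's __path__ in place of Path.glob(). The names demo_definitions and demo_my_object are hypothetical and exist only for the demonstration.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Build a throwaway project on disk so the sketch runs on its own.
project = Path(tempfile.mkdtemp())
package = project / "demo_definitions"  # hypothetical package name
package.mkdir()

(project / "demo_my_object.py").write_text(textwrap.dedent('''
    class MyObject:
        registry = []

        def __init__(self, name="unnamed", coercer=str):
            self.name = name
            self.coercer = coercer
            MyObject.registry.append(self)
'''))

(package / "__init__.py").write_text(textwrap.dedent('''
    import importlib
    import pkgutil

    import demo_my_object

    registry = demo_my_object.MyObject.registry

    # __path__ is set by the import system before this file runs.
    for module_info in pkgutil.iter_modules(__path__):
        if module_info.name.startswith("team_"):
            importlib.import_module(f"{__name__}.{module_info.name}")
'''))

(package / "team_1.py").write_text(textwrap.dedent('''
    from demo_my_object import MyObject

    MyObject("team 1 widget 1", str)
    MyObject("team 1 widget 2", int)
'''))

(package / "team_2.py").write_text(textwrap.dedent('''
    from demo_my_object import MyObject

    MyObject("team 2 widget 1", float)
'''))

# Make the throwaway project importable, then import the package once.
sys.path.insert(0, str(project))
importlib.invalidate_caches()
definitions = importlib.import_module("demo_definitions")

names = sorted(definition.name for definition in definitions.registry)
print(names)
```

If the order of the registry matters, sort the discovered module names explicitly; Path.glob() does not promise any particular order.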
    

    definitions/team_1.py

    Each team module imports MyObject so that it can create instances. This one shows that a module can instantiate several MyObjects and define other things as well.

    import pathlib
    from my_object import MyObject
    
    file_name = pathlib.Path(__file__).stem
    
    print(f"        in {__package__}.{file_name}")
    
    # assign the object (can get it through registry or as team_1.widget_1)
    widget_1 = MyObject("team 1 widget 1", str)
    
    # don't assign the object (can only get it through the registry)
    MyObject("team 1 widget 2", int)
    
    # can define other things too (variables, functions, classes, etc.)
    foo = 'this is team_1.foo'
    

    definitions/team_2.py

    from my_object import MyObject
    
    print(f"        in {__name__}")
    
    # team 2, widget 1 definition
    MyObject("team 2 widget 1", float)
    

    Other stuff

    If you can't change MyObject, perhaps you can subclass it and use the subclass in team_1.py, etc.
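
A minimal sketch of the subclass idea, assuming the original MyObject from the question stays untouched. RegisteredObject is a hypothetical name, and the stand-in MyObject is repeated here only so the example runs on its own.

```python
class MyObject:
    """Stand-in for the question's original class, left unmodified."""

    def __init__(self):
        self.name = "unnamed"
        self.coercer = int

    def __repr__(self):
        return f"MyObject instance: {self.name} / {self.coercer}"


class RegisteredObject(MyObject):
    """Hypothetical subclass that takes arguments and records each instance."""

    registry = []

    def __init__(self, name="unnamed", coercer=int):
        super().__init__()  # let the original class set its defaults first
        self.name = name
        self.coercer = coercer
        RegisteredObject.registry.append(self)


RegisteredObject("team 1 widget 1", str)
RegisteredObject("team 2 widget 1", float)
print(RegisteredObject.registry)
```

The team modules would then import RegisteredObject instead of MyObject, and code that expects a MyObject still works because every instance is one.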

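    Alternatively, define a make_myobject() factory function in my_object.py, next to a module-level registry list:

    # module-level list, needed because the unmodified MyObject has no registry
    registry = []

    def make_myobject(name="unnamed", coercer=str):
        definition = MyObject()
        definition.name = name
        definition.coercer = coercer
        registry.append(definition)
        return definition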
    

    Then team_1.py would look like:

    from my_object import make_myobject
    
    make_myobject("team 1 widget 1", str)
    
    ....
    

    Lastly, int, str, and other types, classes, etc. can be looked up by name. So in your simplified example, MyObject() or make_myobject() could take the name of a coercer and look it up.

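    import importlib
    
    def find_coercer(name):
        """Find the thing with the given name. If it is a dotted name, import
        the named module and look it up there. If it isn't a dotted name,
        look it up in the 'builtins' module.
        """
        module, _, name = name.strip().rpartition('.')
    
        if module == '':
            module = 'builtins'
    
        # import_module also handles modules that have not been imported yet,
        # which a plain sys.modules lookup would reject with a KeyError
        coercer = getattr(importlib.import_module(module), name)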
    
        return coercer
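
To illustrate the lookup, here is a condensed, self-contained sketch of the same idea with two example calls. It assumes the module is resolved with importlib.import_module(); the variable names to_int and to_decimal are made up for the demonstration.

```python
import importlib


def find_coercer(name):
    """Resolve 'int' via builtins, or 'package.module.Name' via an import."""
    module_name, _, attribute = name.strip().rpartition(".")
    module = importlib.import_module(module_name or "builtins")
    return getattr(module, attribute)


to_int = find_coercer("int")                  # plain name: found in builtins
to_decimal = find_coercer("decimal.Decimal")  # dotted name: module is imported

print(to_int("42"))
print(to_decimal("1.50"))
```

A name that does not exist raises ModuleNotFoundError or AttributeError, so a definition with a misspelled coercer name fails when the lookup runs rather than later when the coercer is called.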