Search code examples
pythondjangodjango-fixtures

How to instantiate class by package_name.module_name.class_name "path"


I want to implement something a bit similar to django fixture system where in fixture you set model property which indicates a model class of fixture. It looks something like this

my_app.models.my_model

My question is what is the standard way to process such a string in order to create the instance of the class pointed by this "path". I think it should look something like:

  1. Split it to module name and class name parts
  2. Load module (if not loaded)
  3. Acquire class from module by its name
  4. Instantiate it

How exactly should I do it?

Edit: I came up with a dirty solution:

def _resolve_class(self, class_path):
    tokens = class_path.split('.')
    class_name = tokens[-1]
    module_name = '.'.join(tokens[:-1])
    exec "from %s import %s" % (module_name, class_name)
    class_obj = locals()[class_name]
    return class_obj

That does it's job however is dirty because of usage of exec and possibility of manipulating execution by malicious preparation of fixtures. How should it be done properly?


Solution

  • Note that the danger of using exec in a function is that it often allows an attacker to supply bogus values which will cause your function to "accidentally" execute whatever code the attacker wants. Here you're directly writing a function that allows precisely that! Using exec doesn't make it much worse. The only difference is that without exec they have to figure out how to get their code into a file on python's import path.

    That doesn't mean you shouldn't do it. Just be aware of what you're doing. Plugin frameworks inherently have this problem; the whole point of making a framework extensible at runtime is that you want whoever can configure the plugins to be able to execute whatever code they like inside your program. If your program will be used in an environment where the the end users are not the same people who are configuring the plugins, make sure you treat _resolve_class the same way you treat exec; don't allow users to enter strings which you directly pass to _resolve_class!

    Now, that aside, you can avoid the use of exec quite easily. Python has a built-in function __import__ for getting at the underlying implementation of the import mechanism. You can use it to do dynamic imports (help(__import__) was enough for me to figure out how it works to write this answer; there is also the docs if you need a bit more detail). Using that, your function could look something like:

    def _resolve_class(self, class_path):
        modulepath, classname = class_path.rsplit('.', 1)
        module = __import__(modulepath, fromlist=[classname])
        return getattr(module, classname)
    

    (Note that I've also used rsplit with a maximum number of splits to avoid having to split the module path only to rejoin it again)