Search code examples
pythonpython-rq

Is there a way to submit functions from __main__ using Python RQ


In a similar vein to this question, is there any way to submit a function defined in the same file to python-rq? @GG_Python who asked me to create a new question for this.

Usage example:

# somemodule.py
from redis import Redis
from rq import Queue

def somefunc():
    do_something()

q = Queue(connection=Redis('redis://redis'))
q.enqueue(somefunc)

Yes, I know the answer is to define somefunc in someothermodule.py and then in the above snippet from someothermodule import somefunc, but I really don't want to. Maybe I'm being too much of a stickler for form, but somefunc really belongs in the same file in which it is enqueued (in practice, somefunc takes a docker container name and spawns it). I'd really like the whole thing to be self contained instead of having two modules.

I noticed, digging through python-rq source code, that Queue.enqueue can actually take a string rather than the actual module, so I was hoping I could maybe pass somemodule.somefunc, but no so such luck. Any ideas?


Solution

  • Looking at the source, rq is just checking your function's __module__ attribute, which can be trivially changed. The question is, why does rq restrict you from enqueueing jobs from __main__? There must be some reason, and there is: the function's module must be importable by the worker. __main__ is not, because your main module is not named __main__.py on disk. See "Considerations for Jobs" toward the bottom of this page.

    Also, your script has top-level (non-definition) code in it that will be invoked anew each time it is imported by a worker, which you probably don't want to do, as it will create new queues and fill them with jobs when each worker starts, infinitely. If you want to enqueue a function in your main module, you can and should prevent this recursive behavior with an if __name__ == "__main__" guard.

    If you want to keep the function and its enqueuement in a single module, my suggestion is that you don't put any top-level code in it besides function and/or class definitions. Anything that would be top-level code, write as a function (named e.g. main()). Then write a "dummy" main module that imports your real one and kicks off the processing.

    Example:

    somemodule.py
    from redis import Redis
    from rq import Queue
    
    def somefunc():
        do_something()
    
    def main():
        q = Queue(connection=Redis('redis://redis'))
        q.enqueue(somefunc)
    
    # if the user tried to execute this module, import the right one for them.
    # somemodule is then imported twice, once as __main__ and once as somemodule,
    # which will use a little extra memory and time but is mostly harmless
    if __name__ == "__main__":
        import mainmodule
    
    mainmodule.py
    import somemodule
    somemodule.main()
    

    You could also just change the __module__ attribute of your function to the actual name of your module on disk, so that it can be imported. You can even write a decorator to do this automatically:

    from sys import modules
    from from os.path import basename, splitext
    
    def enqueueable(func):
        if func.__module__ == "__main__":
            func.__module__, _ = splitext(basename(modules["__main__"].__file__))
        return func
    
    @enqueueable
    def somefunc():
        do_something()
    
    if __name__ == "__main__":
        from redis import Redis
        from rq import Queue
    
        q = Queue(connection=Redis('redis://redis'))
        q.enqueue(somefunc)
    

    For simplicity, the decorator assumes your module is a single file importable under its filename with the .py extension stripped off. You could be using a package for your main module, in which things will get more complicated... so don't do that.