python celery

global variable in a task


I am using Celery Director to write Celery tasks, and I need to pass parameters from one step to a downstream step, so I wrote the following decorator to do that. The decorator needs to access a variable named params that is defined in the task, so I used a global variable (global params). The question is: will the global variable result in a race condition? Suppose the first execution sets params to {"test":{"v1":1}} and the second execution sets params to {"test":{"v1":2}} because they pass different kwargs. If they are running at nearly the same time, is it possible for both executions to see the same params value, since it is global?

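from director import task  # Celery Director's task decorator


def inject_params(func):
    def inner(*args, **kwargs):
        # Collect the "params" dicts returned by upstream steps.
        # Note: the dict union operator (|) requires Python 3.9+.
        inherit_params = {}
        for arg in args:
            if isinstance(arg, dict):
                if "params" in arg:
                    inherit_params = arg["params"] | inherit_params
        # Explicitly passed kwargs override inherited params.
        new_kwargs = inherit_params | kwargs
        result = func(*args, **new_kwargs)
        # Read the module-level params that the task assigned while it ran.
        global params
        return {"params": params | inherit_params, "result": result}
    inner.__name__ = func.__name__
    return inner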

params = {}


@task(name="EVALUATE_DELETE_PERFORMANCE")
@inject_params
def evaluate_delete_performance(*args, **kwargs):
    # Overwrites the module-level variable, which outlives this execution.
    global params
    params = {"test": kwargs}

    return "some value"

Solution

  • The global variable (params) will contain inconsistent data.

    Why? Imagine that we have X workers. The scheduler starts the first task on the first worker. Nothing pauses or locks a worker while another execution writes or reads a global variable, so access to it is not synchronized. You can easily check this with a regular counter (just an example):

    tasks.py:

    from celery import Celery
    
    app = Celery('tasks', broker='redis://localhost')
    
    COUNTER = 0
    
    @app.task
    def plus_one(name: str):
        global COUNTER
        COUNTER += 1
        print(f'name: {name}. counter: {COUNTER}')
    

    run_task.py:

    from tasks import plus_one
    
    for i in range(1000):
        plus_one.delay('first')
        plus_one.delay('second')
    
    • Start the Celery worker in the first terminal ( celery -A tasks worker )
    • Run the tasks from the second terminal ( python run_task.py )

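    You'll see inconsistent counter values in the output rather than one sequence that counts steadily up to 2000 (assuming the worker runs more than one process, which is the default on a multi-core machine). Global variables do not solve race conditions or other concurrency problems. In general, I think it is better to use them only for constants, static flags, and the like.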
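The scenario from the question can also be reproduced without Celery. The following is a minimal sketch, assuming both executions share one process (for example with a thread-based worker pool); the function names and the two `threading.Event` objects are hypothetical and only force the unlucky interleaving so that the result is repeatable:

```python
import threading

params = {}   # module-level variable, as in the question
seen = {}     # what each execution reads back from the global

# Hypothetical helpers that force the interleaving deterministically.
first_wrote = threading.Event()
second_wrote = threading.Event()


def first_execution():
    global params
    params = {"test": {"v1": 1}}
    first_wrote.set()
    second_wrote.wait()        # the other execution overwrites params here
    seen["first"] = params     # reads the other execution's value


def second_execution():
    global params
    first_wrote.wait()
    params = {"test": {"v1": 2}}
    second_wrote.set()
    seen["second"] = params


threads = [
    threading.Thread(target=first_execution),
    threading.Thread(target=second_execution),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(seen["first"])   # {'test': {'v1': 2}}
print(seen["second"])  # {'test': {'v1': 2}}
```

Both executions end up with the value written by the second one, which is exactly the overlap the question asks about.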

    By the way, similar question
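One way to avoid the global entirely is to let the task hand its params back to the decorator through the return value. This is only a sketch under an assumed convention (the wrapped function returns a `(result, params)` tuple), not something Celery Director prescribes. The `@task(...)` decorator is left out so the snippet runs on its own; in a real workflow it would presumably be stacked on top, as in the question.

```python
import functools


def inject_params(func):
    """Variant of the question's decorator that keeps all state local."""
    @functools.wraps(func)
    def inner(*args, **kwargs):
        # Collect the "params" dicts returned by upstream steps.
        inherit_params = {}
        for arg in args:
            if isinstance(arg, dict) and "params" in arg:
                inherit_params = arg["params"] | inherit_params
        new_kwargs = inherit_params | kwargs
        # Assumed convention: func returns (result, params).
        result, params = func(*args, **new_kwargs)
        return {"params": params | inherit_params, "result": result}
    return inner


@inject_params
def evaluate_delete_performance(*args, **kwargs):
    params = {"test": kwargs}      # local variable, one per execution
    return "some value", params


first = evaluate_delete_performance(v1=1)
second = evaluate_delete_performance(v1=2)
print(first)   # {'params': {'test': {'v1': 1}}, 'result': 'some value'}
print(second)  # {'params': {'test': {'v1': 2}}, 'result': 'some value'}
```

Because params only ever lives on the call stack of a single execution, concurrent executions cannot overwrite each other's values, regardless of the worker pool in use.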