celery, plotly-dash

Dash Celery setup


I have a docker-compose setup for my Dash application. I need suggestions on the preferred way to set up my Celery image.

I am using Celery for the following use cases, all of which must be cancellable (abortable/revocable) tasks:

  • Upload file
  • Model training
  • Create train, test set

Case-1. Create one service named celery with command: ["celery", "-A", "tasks", "worker", "--loglevel=INFO", "--pool=prefork", "--concurrency=3", "--statedb=/celery/worker.state"] Here we use the default queue, a single worker (main process) and 3 child worker processes (i.e. it can execute 3 tasks simultaneously). If I revoke a task, will it kill the main worker, or just the child worker process executing that task?

Case-2. Create three services named celery-{task_name} (celery-upload etc.), each with command: ["celery", "-A", "tasks", "worker", "--loglevel=INFO", "--pool=prefork", "--concurrency=1", "--statedb=/celery/worker.state", "--queues=upload_queue", "--hostname=celery_worker_upload_queue"] Here each container uses a custom queue, a single worker (main process) and 1 child worker process (i.e. it can execute 1 task at a time), so there is one service per task. If I revoke a task, will it kill the main worker, or only the single child worker process executing that task in the respective container, while the other Celery containers stay alive?

I tried the following signals with task.revoke(terminate=True):

  • SIGKILL and SIGTERM: I observed that @worker_process_shutdown.connect and @task_revoked.connect both fire. Does this mean that the main worker and the child worker process the revoke was issued for (or all child processes, since the main worker is down) are down?
  • SIGUSR1: I observed that only @task_revoked.connect fires. Does this mean that the main worker is still running and only the child worker process the revoke was issued for is down?

Which case is preferred? Is it possible to combine both cases, i.e. a single Celery service with individual main workers, each with its own child worker process and its own queue? Or a single Celery service with a single main worker, dedicated child worker processes and dedicated queues for the respective tasks?

One more doubt: I think Celery is required for the tasks listed above. Now say I have a button for cleaning a dataframe; does this require Celery too? That is, do I need Celery wherever I am dealing with dataframes?
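Per-task queues do not by themselves require per-task services, because routing is a configuration setting. Below is a minimal sketch of a `celeryconfig.py`-style module using Celery's `task_routes` and `task_default_queue` settings. The queue names and the task name `job_one` come from this post; the task name `model_training` and the helper `queue_for` are hypothetical and only illustrate the lookup for exact task names (Celery itself also accepts glob and regex patterns).

```python
# celeryconfig.py-style settings: plain Python, importable without a broker.
# Assumption: loaded with celery_app.config_from_object("celeryconfig").

# Queue used for any task that has no explicit route.
task_default_queue = "default_queue"

# Map task names to queues.
task_routes = {
    "job_one": {"queue": "upload_queue"},
    "model_training": {"queue": "model_queue"},  # hypothetical task name
}


def queue_for(task_name: str) -> str:
    """Illustrative helper: resolve an exact task name to its queue."""
    return task_routes.get(task_name, {}).get("queue", task_default_queue)


if __name__ == "__main__":
    for name in ("job_one", "model_training", "clean_dataframe"):
        print(name, "->", queue_for(name))
```

A single worker can then consume several queues with `-Q upload_queue,model_queue,default_queue`, while a dedicated worker lists only its own queue.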

Please suggest.

UPDATE-2: in this post, "worker processes" means child worker processes.

This is how I am using it:

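import signal

from celery import states
from celery.contrib.abortable import AbortableTask
from celery.exceptions import Ignore, SoftTimeLimitExceeded
from celery.result import result_from_tuple

# celery_app, job, data and n_clicks are defined elsewhere in the application.

# Start button: enqueue the task on its dedicated queue
result = background_task_job_one.apply_async(args=(n_clicks,), queue="upload_queue")

# Cancel button: rebuild the result from its stored tuple, then revoke it
result = result_from_tuple(data, app=celery_app)
result.revoke(terminate=True, signal=signal.SIGUSR1)

# Task
@celery_app.task(bind=True, name="job_one", base=AbortableTask)
def background_task_job_one(self, n_clicks):
    msg = "Aborted"
    status = False

    try:
        msg = job(n_clicks)  # Long-running task
        status = True
    except SoftTimeLimitExceeded:
        # Reached when the revoke with SIGUSR1 interrupts job();
        # mark the task as REVOKED and skip storing a return value.
        self.update_state(task_id=self.request.id, state=states.REVOKED)
        raise Ignore()
    finally:
        print("FINALLY")
    return status, msg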

Is this an acceptable way to handle cancellation of a running task? Can you elaborate on this line: [In practice you should not send signals directly to worker processes.] Also, to clarify this line: [In prefork concurrency (the default) you will always have at least two processes running - Celery worker (coordinator) and one or more Celery worker-processes (workers)]

Does this mean that celery -A app worker -P prefork gives 1 main worker and 1 child worker process? Is it the same as celery -A app worker -P prefork -c 1, which gives 1 main worker and 1 child worker process?
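On the process count: Celery documents the default for `-c`/`--concurrency` as the number of CPUs available, so omitting `-c` is only equivalent to `-c 1` on a single-CPU machine. The sketch below is a back-of-the-envelope calculation, not Celery's own code; the function name `prefork_process_count` is hypothetical and the CPU count is approximated with `os.cpu_count()`, which may differ from what Celery sees inside a CPU-limited container.

```python
import os
from typing import Optional


def prefork_process_count(concurrency: Optional[int] = None) -> int:
    """Estimate the processes of one prefork worker: coordinator plus children.

    Assumption: without an explicit concurrency, the number of child
    processes equals the CPU count reported by the operating system.
    """
    children = concurrency if concurrency is not None else (os.cpu_count() or 1)
    return 1 + children


if __name__ == "__main__":
    print("with -c 1:", prefork_process_count(1))
    print("with -c 3:", prefork_process_count(3))
    print("without -c on this machine:", prefork_process_count())
```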

Earlier, I tried the AbortableTask class and called abort(). It successfully updated the state and status to ABORTED, but the task was still alive and running.

I read that to terminate a currently executing task it is necessary to pass terminate=True. This works: the task stops executing, but I need to update the task state and status to REVOKED manually, otherwise they stay at the default PENDING. The only hard decision is whether to use SIGKILL, SIGTERM or SIGUSR1. I found that with SIGUSR1 the main worker process stays alive and only the child worker process executing that task is revoked.

Also, luckily I found this link: I can set up a single Celery service with multiple dedicated child worker processes, each with its own dedicated queue.
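The observation that abort() changes the state while the task keeps running is consistent with abortable tasks being cooperative: the task body has to poll for the abort flag and return on its own. Below is a minimal, Celery-free sketch of that polling pattern. The function `run_abortable` and its parameters are hypothetical names; it assumes the long-running job can be split into small steps.

```python
from typing import Callable, Iterable, Tuple


def run_abortable(
    steps: Iterable[Callable[[], None]],
    should_abort: Callable[[], bool],
) -> Tuple[bool, str]:
    """Run steps in order, checking the abort flag before each one."""
    done = 0
    for step in steps:
        if should_abort():
            # Stop cleanly between steps; nothing is killed.
            return False, f"Aborted after {done} step(s)"
        step()
        done += 1
    return True, f"Completed {done} step(s)"


if __name__ == "__main__":
    # Demo: the flag flips after two steps have run.
    executed = []
    steps = [lambda: executed.append(1) for _ in range(5)]
    print(run_abortable(steps, lambda: len(executed) >= 2))
```

Inside a bound task declared with base=AbortableTask, `self.is_aborted` could be passed as `should_abort`, with the caller invoking abort() on the task's AbortableAsyncResult. This relies on a result backend, since the aborted state is stored there.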

Case-3: Celery multi

  1. command: ["celery", "multi", "show", "start", "default", "model", "upload", "-c", "1", "-l", "INFO", "-Q:default", "default_queue", "-Q:model", "model_queue", "-Q:upload", "upload_queue", "-A", "tasks", "-P", "prefork", "-p", "/proj/external/celery/%n.pid", "-f", "/proj/external/celery/%n%I.log", "-S", "/proj/external/celery/worker.state"] With this I get an error: the celery service exits with code 0.

  2. command: bash -c "celery multi start default model upload -c 1 -l INFO -Q:default default_queue -Q:model model_queue -Q:upload upload_queue -A tasks -P prefork -p /proj/external/celery/%n.pid -f /proj/external/celery/%n%I.log -S /proj/external/celery/worker.state" With this I also get an error:

celery | Usage: python -m celery worker [OPTIONS]
celery | Try 'python -m celery worker --help' for help.
 
celery | Error: No such option: -p
celery | * Child terminated with exit code 2
celery | FAILED

Some doubts: which is preferred, one worker or multiple workers? With multiple workers and dedicated queues, creating a Docker service for each task makes the docker-compose file and the number of services grow. So I am trying a single Celery service with multiple dedicated child worker processes, each with its own dedicated queue, which makes it easy to abort/revoke/cancel a task.

But I am getting an error with Case-3, i.e. celery multi. Please suggest.


Solution

  • If you revoke a task with terminate=True, it may terminate the worker process that was executing the task. The Celery worker (the coordinator) keeps running, because it needs to coordinate the other worker processes. If the life of the container is tied to the Celery worker, the container keeps running too. In practice you should not send signals directly to worker processes.

    In prefork concurrency (the default) you will always have at least two processes running - Celery worker (coordinator) and one or more Celery worker-processes (workers).

    To answer the last question we may need more details. It would be easier if you could run the Celery task once all dataframes are available. If that is not the case, then perhaps run individual tasks to process each dataframe. It is worth having a look at Celery workflows to see whether you can build a chunked workflow. Keep it simple: start with the assumption that you have all dataframes available at once, and build from there.
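The coordinator/worker-process split described in the first point can be observed with plain OS processes. The sketch below is an analogy, not Celery code: a parent starts a child, signals only the child, and carries on afterwards. The function name `terminate_child` is hypothetical, and the negative exit code convention assumes a POSIX system.

```python
import signal
import subprocess
import sys


def terminate_child(sig: int = signal.SIGTERM) -> int:
    """Start a sleeping child process, signal it, and return its exit code."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    # Only the child receives the signal; this (parent) process is untouched.
    child.send_signal(sig)
    return child.wait(timeout=10)


if __name__ == "__main__":
    code = terminate_child()
    # On POSIX a negative code means "killed by signal number -code".
    print(f"child exit code: {code}; parent still running")
```

In the prefork pool the coordinator plays the parent's role, which is why a container whose main process is the coordinator stays up after a task is terminated.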