Is it safe to use Celery task IDs in HTTP requests?


I am starting to use Celery in a Flask-based web application to run async tasks on the server side.

Several resources get an '/action' sub-resource to which the user/client can send a POST including a JSON-body specifying an action, for example:

curl -H "Content-Type: application/json" -X POST \
  -d '{"doPostprocessing": {"update": true}}' \
  "http://localhost:5000/api/results/123/action"

They get a 202 ACCEPTED response with a header

Location: http://localhost:5000/api/results/123/action/8c742418-4ade-474f-8c54-55deed09b9e5

They can poll this URL to get the final result (or another 202 ACCEPTED if the task is still running).

The ID I am returning for the action is the celery.result.AsyncResult.id.

Is this a safe thing to do? What kind of problems do I create by exposing Celery task IDs directly to the public?

If it is not safe, is there a recommended way to do it? Preferably one that avoids having to track the state of the tasks explicitly.


Solution

  • You will be fine using the task ID. Celery uses Kombu's uuid function, which in turn uses uuid4 by default. A uuid4 is randomly generated rather than derived from the MAC address and a timestamp (as a uuid1 is), so with 122 random bits it is 'random enough' that guessing a valid ID is impractical.

    The only other way would be to have an API endpoint that returns the status of all running tasks for the user, i.e. one that takes no task ID at all. But you would then lose the ability to query an individual task. Other options effectively mask the task ID behind a different random number, so you would have the same brute-force problem.

    I'd recommend looking through the UUID questions on the Security Stack Exchange (https://security.stackexchange.com/search?q=uuid). Some of them will no doubt cover what you're asking.
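
To put rough numbers on the brute-force point, here is a minimal sketch using only the standard library. It does not call Celery or Kombu; new_task_id just assumes the ID is a stringified uuid4, as described above. The guess_probability helper, the token-based alternative, and the traffic figures (one million live IDs, 10**12 guesses) are made up for illustration.

```python
import secrets
import uuid

# A uuid4 has 128 bits, of which 4 are fixed version bits and 2 are fixed
# variant bits, leaving 122 random bits.
UUID4_RANDOM_BITS = 122


def new_task_id() -> str:
    """Stand-in for a Celery task ID (assumption: a stringified uuid4)."""
    return str(uuid.uuid4())


def guess_probability(random_bits: int, live_ids: int, guesses: int) -> float:
    """Upper bound (union bound) on the chance that at least one of
    `guesses` random guesses matches one of `live_ids` valid IDs."""
    return min(1.0, live_ids * guesses / 2 ** random_bits)


task_id = new_task_id()
print(task_id, "version:", uuid.UUID(task_id).version)

# Hypothetical "masking" alternative: an opaque token handed out instead of
# the task ID. 16 random bytes = 128 random bits, so it is still just a
# random number an attacker would have to guess.
masked_token = secrets.token_urlsafe(16)
print(masked_token)

# Illustrative load: one million live IDs, 10**12 guesses by an attacker.
print(guess_probability(UUID4_RANDOM_BITS, live_ids=1_000_000, guesses=10**12))
print(guess_probability(128, live_ids=1_000_000, guesses=10**12))
```

Both probabilities come out vanishingly small, which is the sense in which swapping the task ID for another random token buys very little. Note that this only covers guessing: anyone who obtains a valid URL some other way can still poll it unless the endpoint also checks who is asking.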