Can we limit the throughput of a Luigi Task?


We have a Luigi Task that requests a piece of information from a third-party service. We are limited in the number of requests per minute we can make to that API.

Is there a way to specify, on a per-Task basis, how many tasks of this kind the scheduler may run per unit of time?


Solution

  • We implemented our own rate limiting inside the task. Our API limit was low enough that a single thread could saturate it, so when we received a rate-limit response we simply backed off and retried.

    Another thing you can do is declare the API call as a resource. You set how much of the resource is available in the config, and how much of it the task consumes as a property on the task. This limits you to running n of that task at a time, which caps concurrency rather than calls per minute.

    In the config (for example luigi.cfg):

    [resources]
    api=1
    

    In the Task code, as a class attribute:

    resources = {"api": 1}
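
The "back off and retry" approach mentioned in the answer could look roughly like the sketch below. It is not the answerer's actual code and does not use any Luigi API: `RateLimited` and `call_with_backoff` are hypothetical names, and the sketch assumes your API client can be made to raise an exception when the service returns a rate-limit response. The delay doubles after each failed attempt, with a little random jitter added.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error: raise this when the API returns a rate-limit response."""


def call_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Run call(); on RateLimited, wait with exponential backoff and retry.

    `sleep` is injectable so the waiting can be replaced in tests.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            # Give up after the last allowed attempt and let the error propagate.
            if attempt == max_attempts - 1:
                raise
            # 1s, 2s, 4s, ... capped at max_delay, plus up to 10% jitter.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))
```

Inside the task's run() method you would wrap the request in this helper, for example `call_with_backoff(lambda: fetch(item_id))`, where `fetch` and `item_id` stand in for whatever function and argument perform your request. Combined with `resources = {"api": 1}`, only one task at a time talks to the API, and that task slows itself down whenever the service pushes back.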