google-app-engine, task-queue

GAE Task Queues with ETA and large number of tasks


In my app, I need to send emails to a large number of users when an event happens. I'd like to send those emails out gradually instead of all at once. To make this concrete, let's say I need to email 10,000 users.

I currently do this with a task queue with a maximum rate of 1 task/second. I enqueue 10,000 tasks in batches, and the emails get sent out at a rate of 1/second.

I'd like to switch to giving each task an ETA instead of capping the queue's maximum rate. Conceptually it would look like this (except that task submission would be batched):

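import datetime
from google.appengine.ext import deferred

now = datetime.datetime.utcnow()
for i, email in enumerate(email_list):
    # Stagger the tasks one second apart.
    eta = now + datetime.timedelta(seconds=i)
    deferred.defer(send_email, email, _eta=eta)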
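
The batching mentioned above could be sketched roughly as follows. This is only an illustration in plain Python: `staggered_batches` and its `batch_size` and `interval_seconds` parameters are hypothetical names, not part of the GAE API, and each yielded batch would still have to be turned into tasks and submitted to the queue.

```python
import datetime


def staggered_batches(items, start, interval_seconds=1, batch_size=100):
    """Yield lists of (item, eta) pairs, with ETAs spaced interval_seconds apart.

    batch_size is an assumed value; pick whatever your submission call accepts.
    """
    batch = []
    for i, item in enumerate(items):
        eta = start + datetime.timedelta(seconds=i * interval_seconds)
        batch.append((item, eta))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch
```

Each batch can then be submitted in one go, so the ETA arithmetic stays separate from the enqueueing code.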

Before implementing a change like this, I'd like to have some confidence that GAE can do this efficiently.

If I have 10,000 tasks in a task queue, each with a different ETA, will the GAE task queue be able to monitor all of them efficiently and start each one at approximately the right time? (The precise ETA isn't important.) I don't know what algorithm Google uses for this.

EDIT:

Imagine you inserted a billion tasks in a single day, each with an ETA. How would GAE monitor those tasks to make sure they fired at the right time? Polling all the tasks at some interval (e.g., every minute) would be a terrible solution. Perhaps GAE uses some kind of priority queue. It would be nice to have some confidence that GAE has implemented an algorithm that scales to a large number of tasks with an ETA.
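
To illustrate what a priority-queue approach might look like (this is purely a sketch of the general technique, not a description of how GAE actually schedules tasks), a binary heap keyed on ETA lets a scheduler find the due tasks without scanning the rest. The class and method names here are made up for the example.

```python
import heapq
import itertools


class EtaQueue(object):
    """Toy ETA scheduler backed by a min-heap (hypothetical, not GAE's design)."""

    def __init__(self):
        self._heap = []
        # Tie-breaker so tasks with equal ETAs come out in insertion order
        # and the task objects themselves are never compared.
        self._counter = itertools.count()

    def add(self, eta, task):
        # O(log n) insert.
        heapq.heappush(self._heap, (eta, next(self._counter), task))

    def pop_due(self, now):
        """Remove and return every task whose ETA is <= now, earliest first."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            _, _, task = heapq.heappop(self._heap)
            due.append(task)
        return due
```

With this structure, checking for due work costs O(1) when nothing is due and O(k log n) when k tasks are, regardless of how many tasks are waiting.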


Solution

  • With the stated daily quota of 10 billion tasks one would think they should be able to handle 10,000 of them :)

    In my current project I'm also sending ~10,000 emails (via SendGrid) using tasks with _eta (although in batches of 25), which has worked fine so far...
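
    A rough sketch of that kind of chunking, in plain Python. The function name and the idea of spacing the chunks so the average rate stays at one email per second are assumptions made for illustration, not details from the project described above.

```python
import datetime


def chunked_schedule(recipients, start, chunk_size=25, emails_per_second=1.0):
    """Return (eta, chunk) pairs: one task per chunk of recipients.

    Chunks are spaced so the average send rate is emails_per_second
    (an assumed policy for this example).
    """
    spacing = chunk_size / float(emails_per_second)
    schedule = []
    for n, offset in enumerate(range(0, len(recipients), chunk_size)):
        eta = start + datetime.timedelta(seconds=n * spacing)
        schedule.append((eta, recipients[offset:offset + chunk_size]))
    return schedule
```

    For 10,000 recipients this produces 400 tasks instead of 10,000, each of which could be enqueued with its own `_eta`.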