Tags: python, azure, azure-batch

Retrieving the number of tasks in a particular state in the Azure Batch API


The Azure Batch API provides the list function, which retrieves an enumerable list of the tasks in a job. It accepts a TaskListOptions argument that can, for instance, filter the tasks by state.

I would like to query the API only for the number of tasks in a particular state, but the API does not provide a function for that. I can get the number by downloading and enumerating all the tasks, for instance like so:

n = sum(1 for t in bsc.task.list(job.id, bm.TaskListOptions(filter="state eq 'Completed'")))

This is of course horribly slow. The OData specification does provide the $count query option, but I can't find a way to add that onto the query. Is there a way to use $count with the Batch API, or is there perhaps a completely different alternative, e.g., via raw REST queries bypassing the Batch API?


Solution

  • Updated 2017-07-31:

    You can now query the task counts for a job directly using the get_task_counts API. This will return a TaskCounts object for the specified job.

    As it appears you are using the Azure Batch Python SDK, please use azure-batch version 3.1.0 or later.
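A minimal sketch of how the call above might be wrapped, assuming `bsc` is the Batch service client from the question. The helper name `completed_task_count` is hypothetical. The `getattr` fallback is an assumption covering later SDK releases, where the counts appear to be nested under a `task_counts` attribute of the returned object.

```python
def completed_task_count(batch_client, job_id):
    """Return the number of completed tasks in a job.

    batch_client is expected to be a Batch service client from
    azure-batch 3.1.0 or later (the `bsc` object in the question).
    """
    result = batch_client.job.get_task_counts(job_id)
    # Assumption: some later SDK versions wrap the counts in a result
    # object with a task_counts attribute; fall back to the object itself.
    counts = getattr(result, "task_counts", result)
    # TaskCounts also exposes active, running, succeeded and failed.
    return counts.completed
```

With the names from the question, this would be called as `completed_task_count(bsc, job.id)`, replacing the enumeration of every task with a single request.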

    Original Answer:

    At the moment, a list query like yours is the only way to obtain counts. You can optimize it slightly by providing a select clause so that the server returns only the properties you care about, which reduces the amount of data transferred. This is a common request, and improvements in this area are on their way; this answer will be updated when they are available.

    As for your other question: the language SDKs are built on top of the REST API and expose the full functionality of the REST layer, so raw REST queries would not offer anything the SDK lacks.
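For SDK versions without get_task_counts, the select clause from the original answer could be applied roughly as follows. This is a sketch, not a definitive implementation. The helper name `count_tasks_in_state` and its `options_cls` parameter are hypothetical, and the parameter exists only so the function can be exercised without the SDK installed. Selecting just `id` is an assumption about the smallest useful payload.

```python
def count_tasks_in_state(batch_client, job_id, state, options_cls=None):
    """Count the tasks of a job in a given state by enumerating them.

    Only the id property is requested, so each task record sent by the
    server stays small. The tasks are still enumerated one by one.
    """
    if options_cls is None:
        # Imported lazily; this is bm.TaskListOptions from the question.
        from azure.batch.models import TaskListOptions as options_cls
    options = options_cls(
        filter="state eq '{}'".format(state),  # same filter as the question
        select="id",                           # return only the task id
    )
    return sum(1 for _ in batch_client.task.list(job_id, options))
```

This remains a linear scan over the matching tasks, so it reduces the data transferred but not the number of tasks enumerated.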