Tags: azure, azure-sdk-.net, azure-batch

Azure Batch and use cases for JobManagerTask


I am currently digging into the Azure Batch service, and I am confused about the proper use of a JobManagerTask...

...and maybe what the overall architecture of an Azure Batch application should look like. I have built the architecture below based on code samples from Microsoft found on GitHub.

These are my current application components.

App1 - ClusterHead

  • Creates a job (including an auto pool)
  • Defines the JobManagerTask
  • Runs on a workstation

App2 - JobManagerTask

  • Splits input data into chunks
  • Pushes chunks (unit of work) onto an input queue
  • Creates tasks (CloudTask)

App3 - WorkloadRunner

  • Pulls from the input queue
  • Executes the task
  • Pushes to the output queue

Azure Storage Account

  • Linked to Azure Batch account
  • Provides input & output queues
  • Provides a result table
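
To make the flow between App2, App3 and the queues concrete, here is a minimal local sketch. It does not use the Azure SDK: the in-memory `queue.Queue` objects stand in for the storage queues, and `split_into_chunks`, `job_manager`, `workload_runner` and the `task-N` ids are hypothetical names chosen for illustration only.

```python
import queue


def split_into_chunks(items, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def job_manager(data, chunk_size, input_queue):
    """App2 stand-in: enqueue one unit of work per chunk.

    Returns the ids of the tasks it would create (one per chunk).
    """
    task_ids = []
    for n, chunk in enumerate(split_into_chunks(data, chunk_size)):
        input_queue.put(chunk)
        task_ids.append(f"task-{n}")
    return task_ids


def workload_runner(input_queue, output_queue):
    """App3 stand-in: pull one chunk, process it, push the result."""
    chunk = input_queue.get()
    output_queue.put(sum(chunk))  # placeholder workload: sum the chunk


# In-memory stand-ins for the input and output storage queues.
input_q, output_q = queue.Queue(), queue.Queue()

task_ids = job_manager(list(range(10)), 4, input_q)
for _ in task_ids:  # one runner invocation per created task
    workload_runner(input_q, output_q)

results = [output_q.get() for _ in task_ids]
```

Each task here pulls exactly one message, which mirrors the "one task per unit of work" design described above.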

Azure Durable Function

  • Implements the aggregator pattern using Durable Entities, so that I can read partial results before the whole job has finished.
  • Gets triggered by messages in the output queue
  • Aggregates results and writes the entity to an Azure Storage table
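
As a rough illustration of what I mean by the aggregator, here is a plain-Python stand-in. It is not Durable Functions code: `ResultAggregator`, `add` and `snapshot` are hypothetical names, and the running total is just an example of an aggregate.

```python
from dataclasses import dataclass


@dataclass
class ResultAggregator:
    """Stand-in for an aggregating entity: state can be read at any time."""
    expected: int      # number of partial results the job should produce
    received: int = 0  # partial results seen so far
    total: int = 0     # running aggregate (here: a simple sum)

    def add(self, partial):
        """Called once per message arriving on the output queue."""
        self.received += 1
        self.total += partial

    def snapshot(self):
        """Return the current state, whether or not the job is complete."""
        return {
            "received": self.received,
            "expected": self.expected,
            "total": self.total,
            "complete": self.received >= self.expected,
        }


agg = ResultAggregator(expected=3)
agg.add(6)
early = agg.snapshot()  # readable while the job is still running
agg.add(22)
agg.add(17)
final = agg.snapshot()
```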

Questions

  • Is this a proper use of the JobManagerTask?
  • Why would I want or need a separate binary/application package that encapsulates the JobManagerTask?
  • Could someone please give an example of when I should prefer a JobManagerTask over creating the jobs manually?

Thanks in advance!


Solution

  • Your setup is one valid way to use a JobManagerTask. That said, as you mentioned, if all of the work is generated by the JobManagerTask and the input is fixed, it could make sense to merge that logic into your ClusterHead. In your case it just depends on whether you want the split and upload of your data to happen as part of the job or on the workstation.

    One area where JobManagerTasks excel is continuous input. If you had a bunch of writers feeding a raw input queue, you could have your JobManagerTask run continuously, reading from that queue, splitting the data and creating the tasks.
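
    The continuous case could look roughly like the loop below. This is only a sketch in plain Python, not Batch SDK code: `run_job_manager`, `submit_task` and the `STOP` sentinel are hypothetical, and the in-memory queue stands in for the raw input queue. In a real job manager, `submit_task` is where a task (a CloudTask in your setup) would be added to the job.

```python
import queue


def run_job_manager(raw_queue, submit_task, chunk_size, stop_marker, poll_timeout=0.1):
    """Read raw items until stop_marker arrives, submitting one task per chunk.

    Returns the number of tasks submitted.
    """
    buffer, submitted = [], 0
    while True:
        try:
            item = raw_queue.get(timeout=poll_timeout)
        except queue.Empty:
            continue  # nothing to read yet, keep polling
        if item is stop_marker:
            break
        buffer.append(item)
        if len(buffer) == chunk_size:
            submit_task(list(buffer))  # hand a full chunk to a new task
            submitted += 1
            buffer.clear()
    if buffer:  # flush a final, partially filled chunk
        submit_task(list(buffer))
        submitted += 1
    return submitted


STOP = object()  # sentinel telling the job manager to shut down
raw = queue.Queue()
for record in ["a", "b", "c", "d", "e"]:
    raw.put(record)  # in practice, independent writers would do this
raw.put(STOP)

tasks = []  # stand-in for tasks added to the job
count = run_job_manager(raw, tasks.append, chunk_size=2, stop_marker=STOP)
```

    The point is that the workstation app only has to create the job once; the splitting and task creation then live inside the job and keep going for as long as input arrives.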