I am running a Custom activity in ADF v2 using the Batch service. Whenever it runs, it creates only one CloudTask within my Batch job, even though I have more than two dozen Parallel.Invoke calls running. Is there a way to create multiple tasks from one Custom activity in ADF so that the processing is spread across all nodes in the Batch pool?
I have a fixed pool with two nodes. Max tasks per node is set to 8 and the scheduling policy is set to "Spread". I have only one Custom activity in my pipeline, with multiple Parallel.Invoke calls (almost two dozen). I was hoping this would create multiple CloudTasks that would be spread across both of my nodes, as both nodes are single core. It looks like each Custom activity run in ADF creates only one task (CloudTask) in the Batch service.
My other hope was to use
https://learn.microsoft.com/en-us/azure/batch/tutorial-parallel-dotnet
and create multiple CloudTasks programmatically in my console application, then run that console application from the ADF Custom activity. But CloudTask takes a task ID and a command line. I wanted to do something like the following, but instead of passing taskCommandLine, I wanted to pass a C# method name and parameters to execute:
string taskId = "task" + i.ToString().PadLeft(3, '0');
string taskCommandLine = "ping -n " + rand.Next(minPings, maxPings + 1).ToString() + " localhost";
CloudTask task = new CloudTask(taskId, taskCommandLine);
// Wanted to do: CloudTask task = new CloudTask(taskId, SomeMethod(args));
tasks.Add(task);
Also, it looks like we can't create CloudTasks using the Batch .NET API from within an ADF Custom activity.
What do I want to achieve?
I have data in a SQL Server table and I want to run different transformations on it by slicing it horizontally or vertically (by picking rows or columns). I want to run those transformations in parallel: multiple CloudTask instances, each operating on a specific column independently and loading the result into a different table after the transformation. The issue is that it looks like we can't use the Batch .NET API within ADF, and the only way seems to be having multiple Custom activities in my Data Factory pipeline.
A CloudTask runs a command line, not a C# method. The application needs to be deployed to every node in the Batch pool, and the CloudTasks need to be created by calling the application from the command line:
CloudTask task = new CloudTask(
    "MyTask",
    "cmd /c %AZ_BATCH_APP_PACKAGE_MyTask%\\myTask.exe -args -here");