Tags: java, spring-cloud-dataflow, spring-cloud-task

Spring Cloud Data Flow : Unable to launch multiple instances of the same Task


TL;DR

Spring Cloud Data Flow (SCDF) does not seem to allow multiple executions of the same Task, even though the documentation says this is the default behavior. How can we get SCDF to run multiple instances of the same task at the same time when launching tasks through the Java DSL? To make things more interesting, launching the same task multiple times works fine when hitting the REST endpoints directly, for example with curl.

Background :

I have a Spring Cloud Data Flow Task that I have pre-registered in the Spring Cloud Data Flow UI Dashboard:

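import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.ApplicationArguments;
import org.springframework.boot.ApplicationRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.task.configuration.EnableTask;

@SpringBootApplication
@EnableTask
public class TaskApplication implements ApplicationRunner {

    private static final Logger LOGGER = LoggerFactory.getLogger(TaskApplication.class);

    public static void main(String[] args) {
        SpringApplication.run(TaskApplication.class, args);
    }

    @Override
    public void run(ApplicationArguments args) throws InterruptedException {
        //Some application code
    }

}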

I am launching this task using the Task Java DSL in some other main program :

URI dataFlowUri = URI.create(scdfUri);
DataFlowOperations dataFlowOperations = new DataFlowTemplate(dataFlowUri);
Task task = Task.builder(dataFlowOperations).name("Task1").definition("a:task1")
                .description("Task launched from DSL").build();         
long executionId = task.launch(new ArrayList<>());

This works perfectly fine the first time; however, when I rerun the above code, the launching program logs the following error:

[main] DEBUG org.springframework.web.client.RestTemplate - Response 409 CONFLICT

The SCDF server logs show a similar issue :

2021-05-12 15:12:31.387  WARN 1 --- [nio-9393-exec-3] o.s.c.d.s.c.RestControllerAdvice         : Caught exception while handling a request: Cannot register task Task1 because another one has already been registered with the same name

The interesting part is that I am able to launch multiple instances of the same task if I use the curl command as shown in the API guide.

Question :

  1. As per the SCDF Task documentation, a Task can be rerun by default, and no additional configuration is required to rerun the same task; however, the opposite appears to be true. Does SCDF not allow a task to be rerun by default?

  2. I tried adding the Spring Integration jars as suggested and explicitly set the spring.cloud.task.single-instance-enabled property to false, but started running into NoClassDefFoundError-related issues. I also tried passing this property to the Task.launch method, but that did not solve the issue.

  3. Why can the same task be relaunched multiple times when using the curl command but cannot be relaunched multiple times when using the Java DSL?


Solution

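  • In this case it looks like you are recreating the task definition on every run. Calling build() registers the definition with the server, so the second run fails with a 409 because a definition named "Task1" already exists. You only need to create the task definition once; from that definition you can launch as many times as you like. This would also explain why curl works: the launch endpoint only launches an existing definition and does not try to register it again. For example: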

    URI dataFlowUri = URI.create(scdfUri);
    DataFlowOperations dataFlowOperations = new DataFlowTemplate(dataFlowUri);
    // build() creates the task definition on the server; do this only once per name.
    Task task = Task.builder(dataFlowOperations).name("Task1").definition("a:task1")
                    .description("Task launched from DSL").build();
    // Each launch() starts a new execution of the same definition.
    long executionId = task.launch(new ArrayList<>());
    executionId = task.launch(new ArrayList<>());
    executionId = task.launch(new ArrayList<>());
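
If the launching program itself is run more than once, the same idea becomes "create the definition only if it is missing, then launch". The sketch below models that pattern with a small in-memory stand-in for the server, so it can be run without SCDF. Every type and method in it (TaskLaunchSketch, DefinitionRegistry, DuplicateDefinitionException, launchCreatingIfAbsent) is hypothetical and not part of the SCDF client; in real code the existence check and the launch would go through whatever lookup and launch calls your version of the Data Flow client provides.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of the behaviour described above.
// None of these types are SCDF classes.
class TaskLaunchSketch {

    /** Stand-in for the 409 CONFLICT raised when a definition name is already taken. */
    static class DuplicateDefinitionException extends RuntimeException {
        DuplicateDefinitionException(String name) {
            super("Cannot register task " + name
                    + " because another one has already been registered with the same name");
        }
    }

    /** Stand-in for the server: one definition per name, any number of executions. */
    static class DefinitionRegistry {
        private final Map<String, String> definitions = new HashMap<>();
        private long nextExecutionId = 1;

        // Models the registration that the question's builder code triggers on every run.
        void create(String name, String definition) {
            if (definitions.containsKey(name)) {
                throw new DuplicateDefinitionException(name);
            }
            definitions.put(name, definition);
        }

        boolean exists(String name) {
            return definitions.containsKey(name);
        }

        // Models a launch: every call returns a fresh execution id.
        long launch(String name) {
            if (!definitions.containsKey(name)) {
                throw new IllegalStateException("No task definition named " + name);
            }
            return nextExecutionId++;
        }
    }

    /** Creates the definition only when it is missing, then launches it. */
    static long launchCreatingIfAbsent(DefinitionRegistry server, String name, String definition) {
        if (!server.exists(name)) {
            server.create(name, definition);
        }
        return server.launch(name);
    }

    public static void main(String[] args) {
        DefinitionRegistry server = new DefinitionRegistry();
        long first = launchCreatingIfAbsent(server, "Task1", "a:task1");
        long second = launchCreatingIfAbsent(server, "Task1", "a:task1");
        System.out.println(first + " " + second); // prints "1 2"
    }
}
```

In this model, calling create twice with the same name fails in the same way the question's second run does, while launchCreatingIfAbsent can be called repeatedly and returns a new execution id each time.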