Java Spring Boot job flow to execute two steps simultaneously


I want to write a batch job that has four steps (say A, B, C, and D). The job should execute as follows:

   A
   |
B and C
   |
   D

But the current setup is A -> B -> C -> D. I want to execute B and C at the same time, on different threads.

B and C take different amounts of time (say 5 minutes for B and 10 minutes for C).

More details:

Step A: Truncates a table in the database.

Steps B and C: Chunk-oriented read and write with a chunk size of 2000.

Reader: JdbcCursorItemReader | Processor: Not using | Writer: JdbcBatchItemWriter

Step D: Runs a PL/SQL procedure that depends on the data written in steps B and C, so it should start only after B and C have both completed successfully.

I'm using the Quartz scheduler to run this job. Please also let me know whether I need multithreading to run it, because currently it runs on only one worker thread.

Please include the code as well.

Current job setup:

    public Job demoJob() {
        return jobBuilderFactory.get("demoJob")
            .listener(jobExecutionListener())
            .start(A())
            .next(B())
            .next(C())
            .next(D())
            .build();
    }

Expectation:

    public Job demoJob() {
        return jobBuilderFactory.get("demoJob")
            .listener(jobExecutionListener())
            .start(A())
            .next(B() and C())    // pseudocode: how do I set this up?
            .next(D())
            .build();
    }

Solution

  • As long as the application logic that needs to be parallelized can be split into distinct responsibilities and assigned to individual steps, it can be parallelized in a single process. Parallel Step execution is easy to configure and use.

    When using Java configuration, executing steps (step1, step2) in parallel with step3, and then running step4 once both branches have finished, is straightforward, as follows:

    @Bean
    public Job job(JobRepository jobRepository) {
        return new JobBuilder("job", jobRepository)
            .start(splitFlow())
            .next(step4())
            .build()        //builds FlowJobBuilder instance
            .build();       //builds Job instance
    }
    
    @Bean
    public Flow splitFlow() {
        return new FlowBuilder<SimpleFlow>("splitFlow")
            .split(taskExecutor())
            .add(flow1(), flow2())
            .build();
    }
    
    @Bean
    public Flow flow1() {
        return new FlowBuilder<SimpleFlow>("flow1")
            .start(step1())
            .next(step2())
            .build();
    }
    
    @Bean
    public Flow flow2() {
        return new FlowBuilder<SimpleFlow>("flow2")
            .start(step3())
            .build();
    }
    
    @Bean
    public TaskExecutor taskExecutor() {
        return new SimpleAsyncTaskExecutor("spring_batch");
    }
    

    The configurable task executor is used to specify which TaskExecutor implementation should execute the individual flows. The default is SyncTaskExecutor, but an asynchronous TaskExecutor is required to run the steps in parallel. Note that the job ensures that every flow in the split completes before aggregating the exit statuses and transitioning.
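To make that last point concrete, here is a plain-Java analogy of the "wait for every branch, then continue" behaviour. It is only a sketch of the semantics and is not how Spring Batch implements a split. The class name `SplitSemanticsSketch`, the `run` and `work` methods, and the sleep durations are all made up for illustration; the step names A to D come from the question.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration class: not part of Spring Batch.
class SplitSemanticsSketch {

    /** Runs A, then B and C on two pool threads, then D; returns the completion order. */
    static List<String> run(long millisB, long millisC) {
        List<String> finished = new CopyOnWriteArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            finished.add("A"); // A runs first, on the calling thread

            // B and C start together, each on its own pool thread.
            CompletableFuture<Void> b =
                CompletableFuture.runAsync(() -> work("B", millisB, finished), pool);
            CompletableFuture<Void> c =
                CompletableFuture.runAsync(() -> work("C", millisC, finished), pool);

            // The caller blocks here until BOTH branches are done,
            // however long each one takes.
            CompletableFuture.allOf(b, c).join();

            finished.add("D"); // D only starts after the join
            return List.copyOf(finished);
        } finally {
            pool.shutdown();
        }
    }

    // Stand-in for a step's real work: sleeps, then records its name.
    private static void work(String name, long millis, List<String> finished) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        finished.add(name);
    }

    public static void main(String[] args) {
        System.out.println(run(50, 100));
    }
}
```

Whatever durations are passed in, "A" is always first and "D" is always last in the returned list; only the relative order of "B" and "C" depends on timing.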

    Based on: https://docs.spring.io/spring-batch/docs/current/reference/html/scalability.html#scalabilityParallelSteps
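Applied to the steps in the question, the same pattern might look like the sketch below. It uses the Spring Batch 5 builder style from the example above (the question's `jobBuilderFactory` is the older 4.x style). Several things here are assumptions rather than anything stated in the question: that the four steps are registered as beans named "A", "B", "C" and "D", the names `DemoJobConfig`, `splitExecutor`, `flowB`, `flowC` and `splitBC`, and the pool size of 2. The question's `jobExecutionListener()` is left out for brevity.

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.job.builder.FlowBuilder;
import org.springframework.batch.core.job.builder.JobBuilder;
import org.springframework.batch.core.job.flow.Flow;
import org.springframework.batch.core.job.flow.support.SimpleFlow;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.TaskExecutor;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
public class DemoJobConfig {

    // Assumption: steps A, B, C, D already exist as beans with these names.
    @Bean
    public Job demoJob(JobRepository jobRepository,
                       @Qualifier("A") Step a,
                       @Qualifier("B") Step b,
                       @Qualifier("C") Step c,
                       @Qualifier("D") Step d,
                       @Qualifier("splitExecutor") TaskExecutor splitExecutor) {
        // Wrap B and C in one-step flows so they can be branches of a split.
        Flow flowB = new FlowBuilder<SimpleFlow>("flowB").start(b).build();
        Flow flowC = new FlowBuilder<SimpleFlow>("flowC").start(c).build();

        // The split runs both branches on the given executor and waits for both.
        Flow splitBC = new FlowBuilder<SimpleFlow>("splitBC")
            .split(splitExecutor)
            .add(flowB, flowC)
            .build();

        return new JobBuilder("demoJob", jobRepository)
            .flow(a)          // A first
            .next(splitBC)    // then B and C in parallel
            .next(d)          // then D, after the split has finished
            .end()            // builds the FlowJobBuilder
            .build();         // builds the Job
    }

    // Assumption: a small bounded pool, one thread per parallel branch.
    @Bean
    public TaskExecutor splitExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(2);
        executor.setMaxPoolSize(2);
        executor.setThreadNamePrefix("split-");
        return executor;
    }
}
```

On the Quartz part of the question: as far as I understand the split behaviour, the thread that launches the job (the Quartz worker, when the job is launched synchronously) waits while B and C run on the executor's threads, so the parallelism comes from the `TaskExecutor` and not from extra Quartz workers. It is worth verifying in your own setup that D is skipped when B or C fails, since that depends on the aggregated exit status of the split.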