spring-boot, spring-batch, batch-processing

Spring Batch doesn't update Job repository after force termination


I'm using a Spring Boot 2.4.x app with Spring Batch 4.3.x. I've created a simple job: a FlatFileItemReader that reads from a CSV file, an ImportKafkaItemWriter that writes to a Kafka topic, and one step that combines the two. I'm using a SimpleJobLauncher with a ThreadPoolTaskExecutor set as its TaskExecutor (a simplified sketch of the configuration is included at the end of this question). This all works as expected. However, there is one resilience use case I need: if I kill the app, then restart it and trigger the job again, it should carry on and finish the remaining work. Unfortunately, that is not happening. I investigated further and found that when I forcibly close the app, the key Spring Batch job repository tables look like this:

| job_execution_id | version | job_instance_id | create_time | start_time | end_time | status | exit_code | exit_message | last_updated | job_configuration_location |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 2021-06-16 09:32:43 | 2021-06-16 09:32:43 | (null) | STARTED | UNKNOWN | (null) | 2021-06-16 09:32:43 | (null) |

and

| step_execution_id | version | step_name | job_execution_id | start_time | end_time | status | commit_count | read_count | filter_count | write_count | read_skip_count | write_skip_count | process_skip_count | rollback_count | exit_code | exit_message | last_updated |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 4 | productImportStep | 1 | 2021-06-16 09:32:43 | (null) | STARTED | 3 | 6 | 0 | 6 | 0 | 0 | 0 | 0 | EXECUTING | (null) | 2021-06-16 09:32:50 |

If I manually update these tables, setting a valid end_time and the status to FAILED, then I can restart the job and it works absolutely fine. What do I need to do so that Spring Batch updates these repository tables appropriately itself and I can avoid these manual steps? I can provide more information about the code if needed.
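For reference, the setup described above corresponds roughly to the following simplified sketch (the Product type, field names, chunk size, and file location are placeholders; ImportKafkaItemWriter is the custom ItemWriter<Product> mentioned above and is assumed to be registered as a bean):

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.launch.support.SimpleJobLauncher;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
@EnableBatchProcessing
public class ProductImportJobConfig {

    @Bean
    public FlatFileItemReader<Product> productReader() {
        // CSV reader; file location and column names are placeholders
        BeanWrapperFieldSetMapper<Product> fieldSetMapper = new BeanWrapperFieldSetMapper<>();
        fieldSetMapper.setTargetType(Product.class);
        return new FlatFileItemReaderBuilder<Product>()
                .name("productReader")
                .resource(new FileSystemResource("products.csv"))
                .delimited()
                .names(new String[] {"id", "name", "price"})
                .fieldSetMapper(fieldSetMapper)
                .build();
    }

    @Bean
    public Step productImportStep(StepBuilderFactory stepBuilderFactory,
                                  FlatFileItemReader<Product> productReader,
                                  ImportKafkaItemWriter productWriter) {
        // Single chunk-oriented step combining the CSV reader and the Kafka writer
        return stepBuilderFactory.get("productImportStep")
                .<Product, Product>chunk(2)
                .reader(productReader)
                .writer(productWriter)
                .build();
    }

    @Bean
    public Job productImportJob(JobBuilderFactory jobBuilderFactory, Step productImportStep) {
        return jobBuilderFactory.get("productImportJob")
                .start(productImportStep)
                .build();
    }

    @Bean
    public JobLauncher asyncJobLauncher(JobRepository jobRepository) throws Exception {
        // SimpleJobLauncher running jobs asynchronously on a ThreadPoolTaskExecutor
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.initialize();

        SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
        jobLauncher.setJobRepository(jobRepository);
        jobLauncher.setTaskExecutor(taskExecutor);
        jobLauncher.afterPropertiesSet();
        return jobLauncher;
    }
}
```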


Solution

  • If I manually update these tables, setting a valid end_time and the status to FAILED, then I can restart the job and it works absolutely fine. What do I need to do so that Spring Batch updates these repository tables appropriately itself and I can avoid these manual steps?

    When a job is killed abruptly, Spring Batch won't have a chance to update its status in the Job repository, so the status is stuck at STARTED. Now when the job is restarted, the only information that Spring Batch has is the status in the job repository. By just looking at the status in the database, Spring Batch cannot distinguish between a job that is effectively running and a job that has been killed abruptly (in both cases, the status is STARTED).

    The way to go is indeed to manually update the tables and mark the status either as FAILED (so the job can be restarted) or as ABANDONED (to abandon it). This is a business decision that you have to make, and there is no way to automate it on the framework side. For more details, please refer to the Aborting a Job section of the reference documentation.
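If you decide that FAILED is the right status for such orphaned executions, you can apply that decision either with direct SQL updates on the BATCH_JOB_EXECUTION / BATCH_STEP_EXECUTION tables (as you are doing now) or through the JobExplorer / JobRepository APIs before re-launching the job. The following is a minimal sketch of the API-based variant (the component and method names are illustrative, the job name is passed in by the caller, and the choice of FAILED over ABANDONED remains your decision):

```java
import java.util.Date;

import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.explore.JobExplorer;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.stereotype.Component;

@Component
public class StaleExecutionMarker {

    private final JobExplorer jobExplorer;
    private final JobRepository jobRepository;

    public StaleExecutionMarker(JobExplorer jobExplorer, JobRepository jobRepository) {
        this.jobExplorer = jobExplorer;
        this.jobRepository = jobRepository;
    }

    /**
     * Marks executions left in STARTED state by a killed JVM as FAILED so that
     * the job instance becomes restartable. Call this before re-launching the job.
     */
    public void markStaleExecutionsAsFailed(String jobName) {
        for (JobExecution jobExecution : jobExplorer.findRunningJobExecutions(jobName)) {
            for (StepExecution stepExecution : jobExecution.getStepExecutions()) {
                if (stepExecution.getStatus().isRunning()) {
                    stepExecution.setStatus(BatchStatus.FAILED);
                    stepExecution.setEndTime(new Date());
                    jobRepository.update(stepExecution);
                }
            }
            jobExecution.setStatus(BatchStatus.FAILED);
            jobExecution.setExitStatus(ExitStatus.FAILED);
            jobExecution.setEndTime(new Date());
            jobRepository.update(jobExecution);
        }
    }
}
```

This does the same thing as the manual UPDATE statements, just through the framework's own repository abstraction; whether a stuck execution should become FAILED (restartable) or ABANDONED (not restartable) still has to be decided case by case.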