I have a single step springbatch application. The job is as follows:
@Bean
public Job databaseCursorJob(@Qualifier("databaseCursorStep") Step exampleJobStep,
JobBuilderFactory jobBuilderFactory) {
return jobBuilderFactory.get("databaseCursorJob")
.incrementer(new RunIdIncrementer())
.flow(exampleJobStep)
.end()
.build();
}
I start the job from a springboot application. This afternoon, I attempted to add a second step to the job. Essentially as follows:
@Bean
public Job databaseCursorJob(@Qualifier("databaseCursorStep") Step exampleJobStep,
JobBuilderFactory jobBuilderFactory) {
return jobBuilderFactory.get("databaseCursorJob")
.incrementer(new RunIdIncrementer())
.flow(exampleJobStep).next(partitionStep())
.end()
.build();
}
In other words, just adding the "next(partitionStep()). However, ever since I did this, the job finishes without executing any step (see shell output below). In fact, even after removing the second step and going back to the original job, it refuses to execute the step. Before attempting to add the second step, I never once encountered this problem. I have gone so far as restarting my VM and it still skips the step. I am rather dead in the water until I resolved this. Grateful for any insights. thanks.
2020-09-01 14:49:00.260 INFO 6913 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port(s): 8087 (http) with context path ''
2020-09-01 14:49:00.263 INFO 6913 --- [ main] f.p.r.Application : Started Application in 7.752 seconds (JVM running for 9.092)
2020-09-01 14:49:00.268 INFO 6913 --- [ main] o.s.b.a.b.JobLauncherCommandLineRunner : Running default command line with: []
2020-09-01 14:49:00.579 INFO 6913 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [FlowJob: [name=databaseCursorJob]] launched with the following parameters: [{}]
2020-09-01 14:49:00.698 INFO 6913 --- [ main] o.s.batch.core.job.SimpleStepHandler : Step already complete or not restartable, so no action to execute: StepExecution: id=120, version=4, name=databaseCursorStep, status=COMPLETED, exitStatus=COMPLETED, readCount=1, filterCount=0, writeCount=1 readSkipCount=0, writeSkipCount=0, processSkipCount=0, commitCount=2, rollbackCount=0, exitDescription=
2020-09-01 14:49:00.730 INFO 6913 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [FlowJob: [name=databaseCursorJob]] completed with the following parameters: [{}] and the following status: [COMPLETED]
My issue was that my job had no way recover if there was an error or stuck in an unknown state. The step was not "already complete", it never completed. Its status was still "STARTED", and exit code "UNKNOWN" because it never exited. Anyway, my job repository is not in memory, but captured to a local DB, which is why it never resolved itself even after restarting VM (shame on me for not remembering this). So, I was able to fix by wiping out the job instance history, however that was a band-aid. I still have to fix my code to prevent it from happening again.
I also learned I could diagnose by examining the job repository in the database (its all there).
I really resolved this thanks Mr Hassine who responded above several times and pointed me in the right direction. The solution to prevent in the future is indeed addressed in the link he provided in his first response: Spring Batch error (A Job Instance Already Exists) and RunIdIncrementer generates only once