java spring spring-batch spring-batch-admin

Spring Batch - where does the process run


I'm trying to wrap my head around Spring Batch, and while many tutorials show great code examples, I feel like I'm missing how the "Spring Batch engine" works.

Scenario 1 - On user creation, create user at external service.

  1. Web request
  2. CreateLocalUser()
  3. launch job CreateExternalUser()

CreateExternalUser() can fail for many reasons, so we want to be able to retry and log errors, which Spring Batch can do for us. It is also a decoupled process that has nothing to do with the creation of our local user.

Where does the job run? Will it run in the same thread as the web request, which means the end user will have to wait for the job to finish before getting HTTP status 200?

Imagine I have a web server and a batch server. I want all jobs to run on the batch server, but the jobs themselves can be initiated from the web server. Can Spring Batch do this? Do I need some kind of queue that I can write to from the web server and consume from the batch server, where the actual job will begin?

Scenario 2 - Process lines in a huge file, start a new job for each line

  1. Read lines in a huge file (1,000,000 lines)
  2. Start a new job for each line using input parameters from the file.

Processing the 1,000,000 lines is quick, and the 1,000,000 new jobs will more or less be started at the same time. Where do these run? Do they run asynchronously to the initial job? Will my server be able to handle running all of them more or less at the same time?

Additional question: Is it possible to query jobs based on a job input parameter? For example, in Scenario 1, I want to show the CreateExternalUser job status / error when viewing my local user with Id 1234 on my web page. The CreateExternalUser job has the input parameter userId: 1234.


Solution

  • You have a few questions here so let's go through them one at a time:

    Where does the job run? Will it run in the same thread as the web request, which means the end user will have to wait for the job to finish before getting http status 200?

    That depends on your configuration. If you use the defaults, then yes. The job would run in the same thread and the user would be forced to wait until the job completes in order to get the 200. This obviously isn't a good idea...

    Which is why Spring Batch's SimpleJobLauncher allows you to inject a TaskExecutor. By configuring your JobLauncher to use an async TaskExecutor implementation (ThreadPoolTaskExecutor for example), the job would be executed in a different thread, allowing the controller's processing to complete.
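    The effect of swapping the executor can be seen without Spring at all. The following is a minimal plain-JDK sketch, not Spring Batch code: `LaunchModes` and `launch` are hypothetical names standing in for a launcher, and the "job" only reports which thread it ran on. A synchronous executor models the default behaviour, and a thread pool models an async TaskExecutor.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Plain-JDK model of the launcher/executor relationship; this is not Spring Batch code.
final class LaunchModes {

    // Hypothetical stand-in for launching a job: the "job" just reports
    // the name of the thread it ran on.
    static CompletableFuture<String> launch(Executor executor) {
        CompletableFuture<String> ranOn = new CompletableFuture<>();
        executor.execute(() -> ranOn.complete(Thread.currentThread().getName()));
        return ranOn;
    }

    public static void main(String[] args) {
        String caller = Thread.currentThread().getName();

        // Synchronous executor: the job runs on the caller's thread,
        // so the caller is blocked until the job has finished.
        Executor sync = Runnable::run;
        System.out.println("sync on caller thread:  " + launch(sync).join().equals(caller));

        // Thread pool: the job runs on a pool thread and the caller is free
        // to return right after handing it over (join() here is only for the demo).
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println("async on caller thread: " + launch(pool).join().equals(caller));
        } finally {
            pool.shutdown();
        }
    }
}
```

    In a real application the equivalent wiring is the one described above: a ThreadPoolTaskExecutor passed to the launcher's `setTaskExecutor`, so that the controller can respond while the job is still running.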

    Obviously this is all within a single JVM, which brings us to your next question.

    I want all jobs to run on the Batch server, but the jobs themselves can be initiated from the Web server. Can Spring Batch do this? Do I need some kind of Queue that I can write to from the Webserver and Consume from the Batch server, where the actual job will begin?

    Spring Batch contains a module called Spring Batch Integration. This module provides various capabilities, including using messages to launch Spring Batch jobs. You can use this to have a remote "batch" server that you communicate with from the web server. The communication mechanism is Spring Integration channels, so any transport that Spring Integration supports can carry the launch request (JMS, AMQP, REST, etc.).
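    The shape of that arrangement can be sketched with a plain in-memory queue. This is a JDK-only illustration under stated assumptions, not Spring Batch Integration code: `RemoteLaunchSketch`, `LaunchRequest`, `requestLaunch` and `consumeAndRun` are hypothetical names, and the `BlockingQueue` stands in for whatever channel (JMS, AMQP, ...) connects the two servers. In Spring Batch Integration the message payload would be a `JobLaunchRequest`, handled on the batch side by a `JobLaunchingGateway`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// In-memory model of "web server publishes, batch server consumes".
final class RemoteLaunchSketch {

    // Hypothetical message payload: which job to start and for which user.
    static final class LaunchRequest {
        final String jobName;
        final long userId;

        LaunchRequest(String jobName, long userId) {
            this.jobName = jobName;
            this.userId = userId;
        }
    }

    // Web server side: publish the request and return immediately.
    static void requestLaunch(BlockingQueue<LaunchRequest> channel, String jobName, long userId) {
        channel.add(new LaunchRequest(jobName, userId));
    }

    // Batch server side: take every pending request off the channel and "start" it.
    // A real consumer would hand each request to a job launcher instead of
    // building a description string.
    static List<String> consumeAndRun(BlockingQueue<LaunchRequest> channel) {
        List<String> started = new ArrayList<>();
        LaunchRequest request;
        while ((request = channel.poll()) != null) {
            started.add(request.jobName + "[userId=" + request.userId + "]");
        }
        return started;
    }

    public static void main(String[] args) {
        BlockingQueue<LaunchRequest> channel = new LinkedBlockingQueue<>();
        requestLaunch(channel, "createExternalUser", 1234L);
        System.out.println(consumeAndRun(channel));
    }
}
```

    The point of the design is that the web request only pays for enqueueing a small message; retries, logging and the actual work all happen on the consuming side.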

    Scenario 2 - Process lines in huge file, start new job for each line
    This scenario makes me think you're going down the wrong path for your design. Can you post a new question that elaborates on this use case?

    Additional question: Is it possible to query Jobs based on a job input parameter
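    Job parameters are fundamental to job identification: a job name plus a set of identifying parameters maps to exactly one JobInstance. Because of this, yes, you can look up individual job runs from the parameters, for example by rebuilding the same JobParameters (userId: 1234) and asking the JobRepository for the last JobExecution of that job.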
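    To make the identity rule concrete, here is a small JDK-only model. It is an assumption-laden sketch rather than the Spring Batch API: `JobLookupSketch`, `instanceKey`, `record` and `lastStatusFor` are hypothetical names, and statuses are plain strings. In Spring Batch itself the corresponding lookup goes through the job repository, e.g. `JobRepository.getLastJobExecution(jobName, jobParameters)`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Models "job name + parameters identify one job instance" with a plain map.
final class JobLookupSketch {

    private final Map<String, String> lastStatus = new HashMap<>();

    // Builds the identity of a job instance. Sorting the parameters makes
    // the key independent of the order in which they were supplied.
    static String instanceKey(String jobName, Map<String, Object> params) {
        return jobName + new TreeMap<>(params);
    }

    // Remembers the most recent status for the instance identified by name + params.
    void record(String jobName, Map<String, Object> params, String status) {
        lastStatus.put(instanceKey(jobName, params), status);
    }

    // Looks the status up again from nothing but the job name and parameters.
    String lastStatusFor(String jobName, Map<String, Object> params) {
        return lastStatus.getOrDefault(instanceKey(jobName, params), "UNKNOWN");
    }

    public static void main(String[] args) {
        JobLookupSketch repository = new JobLookupSketch();
        Map<String, Object> params = new HashMap<>();
        params.put("userId", 1234L);

        repository.record("createExternalUser", params, "FAILED");
        System.out.println(repository.lastStatusFor("createExternalUser", params));
    }
}
```

    For the user page in Scenario 1, this means the web layer only needs the user's id to find the matching job run and display its status or error.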