Search code examples
spring-batchspring-integrationspring-batch-integration

How to configure channels and AMQ for spring-batch-integration where all steps are run as slaves on another cluster member


Followup to Configuration of MessageChannelPartitionHandler for assortment of remote steps

Even though the first question was answered (I think well), I think I'm confused enough that I'm not able to ask the right questions. Please direct me if you can.

Here is a sketch of the architecture I am attempting to build. Today, we have a job that runs a step across the cluster that works. We want to extend the architecture to run n (unbounded and different) jobs with n (unbounded and different) remote steps across the cluster.

I am not confusing job executions and job instances with jobs. We already run multiple job instances across the cluster. We need to be able to run other processes that are scalable in hte same way as the one we have defined already.

enter image description here

The source data is all coming from database which are known to the steps. The partitioner is defining the range of data for the "where" clause in the source database and putting that in the stepContext. All of the actual work happens in the stepContext. The jobContext simply serves to spawn steps, wait for completion, and provide the control API.

There will be 0 to n jobs running concurrently, with 0 to n steps from however many jobs running on the slave VM's concurrently.

Does each master job (or step?) require its own request and reply channel, and by extension its own OutboundChannelAdapter? Or are the request and reply channels shared?

Does each master job (or step?) require its own aggregator? By implication this means each job (or step) will have its own partition handler (which may be supported by the existing codebase)

The StepLocator on the slave appears to require a single shared replyChannel across all steps, but it appears to me that the messageChannelpartitionHandler requires a separate reply channel per step.

What I think is unclear (but I can't tell since it's unclear) is how the single reply channel is picked up by the aggregatedReplyChannel and then returned to the correct step. Of course I could be so lost I'm asking the wrong questions.

Thank you in advance


Solution

  • Does each master job (or step?) require its own request and reply channel, and by extension its own OutboundChannelAdapter? Or are the request and reply channels shared?

    No, there is no need for that. StepExecutionRequests are identified with a correlation Id that makes it possible to distinguish them.

    Does each master job (or step?) require its own aggregator? By implication this means each job (or step) will have its own partition handler (which may be supported by the existing codebase)

    That should not be the case, as requests are uniquely identified with a correlation ID (similar to the previous point).

    The StepLocator on the slave appears to require a single shared replyChannel across all steps, but it appears to me that the messageChannelpartitionHandler requires a separate reply channel per step.

    The messageChannelpartitionHandler should be step or job scoped, as mentioned in the Javadoc (see recommendation in the last note). As a side note, there was an issue with message crossing in a previous version due to the reply channel being instance based, but it was fixed here.