
Java batch: jobContext transientUserData not passed through steps


I'm using the JBeret implementation of the JSR-352 spec.

This is my job configuration, in short:

<job id="expired-customer-cancellation" xmlns="http://xmlns.jcp.org/xml/ns/javaee"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
     xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee http://xmlns.jcp.org/xml/ns/javaee/jobXML_1_0.xsd"
     version="1.0" jsl-name="job-parent" parent="job-parent">
    <step id="step1" next="step2">
        <chunk item-count="#{jobParameters['chunksize']}?:3">
            <reader ref="eccReader">
            </reader>
            <writer ref="eccWriter" />
        </chunk>
        <partition>
            <mapper ref="eccMapper">
                <properties>
                    <property name="threads" value="#{jobParameters['threads']}?:3"/>
                    <property name="records" value="#{jobParameters['records']}?:30"/>
                </properties>
            </mapper>
        </partition>
    </step>
    <step id="step2">
        <batchlet ref="eccMailBatchlet" />
    </step>
</job>

The ItemWriter class does something like this:

@Named
public class EccWriter extends AbstractItemWriter {

    @Inject
    Logger logger;

    @Inject
    JobContext jobContext;

    @Override
    public void writeItems(List<Object> list) throws Exception {
        @SuppressWarnings("unchecked")
        ArrayList<String> processed = Optional.ofNullable(jobContext.getTransientUserData()).map(ArrayList.class::cast).orElse(new ArrayList<String>());
        list.stream().map(UserLogin.class::cast).forEach(input -> {

            if (someConditions) {

                processed.add(input.getUserId());

            }
        });

        jobContext.setTransientUserData(processed); // update job transient data with processed list
    }
}

I expected to get the updated list when calling jobContext.getTransientUserData() in step2, but all I get is null.

Furthermore, each partition has its own jobContext transient user data, so it always starts out as null at the beginning of each partition.

I think the name jobContext is itself misleading and invites this kind of mistake.

What is the natural way to carry data through an entire job?


Solution

  • This is a gap in the current API, and the "thread local" behavior can be surprising, I agree.

    One technique is to use the step's persistent user data instead.

    E.g. from step 1:

    // stepContext is an injected javax.batch.runtime.context.StepContext
    stepContext.setPersistentUserData(processed);

    Note that step1 is partitioned, and each partition thread gets its own StepContext, so you may need a PartitionCollector and PartitionAnalyzer to merge the per-partition lists on the main step thread before storing them there.

    Then from step 2:

    @Inject
    JobContext ctx;

    // getStepExecutions expects the job execution id, not the job instance id
    List<StepExecution> stepExecs = BatchRuntime.getJobOperator().getStepExecutions(ctx.getExecutionId());

    // ... not shown here, but sort through stepExecs and find the one from step1.
    StepExecution step1Exec = ... ;

    Serializable userData = step1Exec.getPersistentUserData();
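
    The "find the one from step1" part, which is not shown above, could be sketched roughly as a filter on the step name. To keep the sketch runnable without a batch runtime, StepExec below is a hypothetical stand-in that mirrors only getStepName() and getPersistentUserData() from javax.batch.runtime.StepExecution; StepDataLookup, findUserData and of are likewise illustrative names, not part of any API.

    ```java
    import java.io.Serializable;
    import java.util.List;
    import java.util.Optional;

    final class StepDataLookup {

        /** Hypothetical stand-in for javax.batch.runtime.StepExecution (only the two getters used here). */
        interface StepExec {
            String getStepName();
            Serializable getPersistentUserData();
        }

        /**
         * Finds the first execution with the given step name and returns its
         * persistent user data. The result is empty if no such step exists
         * or if the step stored no data (null).
         */
        static Optional<Serializable> findUserData(List<StepExec> stepExecs, String stepName) {
            return stepExecs.stream()
                    .filter(se -> stepName.equals(se.getStepName()))
                    .findFirst()
                    .map(StepExec::getPersistentUserData); // a null result becomes Optional.empty()
        }

        /** Demo-only factory for building stand-in step executions. */
        static StepExec of(final String name, final Serializable data) {
            return new StepExec() {
                @Override public String getStepName() { return name; }
                @Override public Serializable getPersistentUserData() { return data; }
            };
        }
    }
    ```

    In a real batchlet the same filter would presumably be applied directly to the List<StepExecution> returned by the job operator, with the result cast back to the list type that step1 stored. Returning an Optional keeps the "step not found" and "step stored nothing" cases from surfacing as a NullPointerException in step2.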
    

    This has been noted before as an area for improvement, and should be considered as a future enhancement to Jakarta Batch (the new home of the specification).