
JEE 7 JSR 352 passing data from batchlet to a chunk-step


I have read the standard (and the Javadoc) but still have some questions. My use case is simple: a batchlet fetches data from an external source and acknowledges it (meaning that the data is deleted from the external source after acknowledgement). Before acknowledging the data, the batchlet produces relevant output (an in-memory object) that is to be passed to the next chunk-oriented step.

Questions:

1) What is the best practice for passing data between a batchlet and a chunk step? It seems that I can do that by calling jobContext#setTransientUserData in the batchlet and then in my chunk step I can access that data by calling jobContext#getTransientUserData.

I understand that both jobContext and stepContext are implemented in a thread-local manner. What worries me here is the "Transient" part. What will happen if the batchlet succeeds but my chunk step fails? Will the transient user data still be available, or will it be gone if the job/step is restarted? For my use case it is important that the batchlet runs just once. So even if the job or the chunk step is restarted, the output data from the successfully run batchlet must be preserved - otherwise the batchlet would have to be run once more. (I have already acknowledged the data and it is gone - so running the batchlet once more would not help me.)

2) Follow-up question: In stepContext there are a couple of methods: getPersistentUserData and setPersistentUserData. What is the intended usage of these methods? What does the "Persistent" part refer to? Are these methods relevant only for partitioning?

Thank you! / Daniel


Solution

  • Transient user data is just that, transient, and will not be available during a job restart. A job restart can happen in a different process or on a different machine, so users cannot count on transient data from a previous run being available at restart.

    Step persistent user data is application data that the batch job developer deems necessary to persist for the purpose of restarting, monitoring, or auditing. It will be available at restart, but it is typically scoped to the current step (not shared across steps).

    From reading your brief description, I get the feeling that your two steps are too tightly coupled and could almost be considered a single unit of work. You want them to either both succeed or both fail in order to maintain the integrity of your application state. I think that could be the root of the problem.
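
To make the difference between transient and persistent user data concrete, here is a minimal, self-contained sketch. It does not use the real javax.batch API: SketchStepContext is a hypothetical stand-in that only mimics the four user-data accessors, and its restart() method is an assumption that models a restart as "persistent data is serialized and read back, transient data is dropped". A real batch runtime decides when and where the persistent data is actually stored.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical stand-in for a step context; NOT the javax.batch API.
final class SketchStepContext {
    private Object transientUserData;        // lives only in the current execution
    private Serializable persistentUserData; // saved by the (simulated) job repository

    Object getTransientUserData() { return transientUserData; }
    void setTransientUserData(Object data) { transientUserData = data; }
    Serializable getPersistentUserData() { return persistentUserData; }
    void setPersistentUserData(Serializable data) { persistentUserData = data; }

    // Assumption: a restart keeps only what can be serialized and stored.
    // The transient field of the new context is deliberately left null.
    SketchStepContext restart() {
        SketchStepContext next = new SketchStepContext();
        if (persistentUserData == null) {
            return next;
        }
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(persistentUserData);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                next.persistentUserData = (Serializable) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not round-trip persistent data", e);
        }
        return next;
    }
}

final class RestartSketch {
    // Stores the same data both ways, simulates a restart, and reports
    // what the restarted step can still see: {transient, persistent}.
    static String[] run() {
        SketchStepContext firstRun = new SketchStepContext();
        ArrayList<String> fetched = new ArrayList<>(Arrays.asList("record-1", "record-2"));
        firstRun.setTransientUserData(fetched);
        firstRun.setPersistentUserData(fetched);

        SketchStepContext restarted = firstRun.restart();
        return new String[] {
            String.valueOf(restarted.getTransientUserData()),
            String.valueOf(restarted.getPersistentUserData())
        };
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println("transient after restart:  " + seen[0]);
        System.out.println("persistent after restart: " + seen[1]);
    }
}
```

In this sketch the fetched records survive the simulated restart only through the persistent slot, and only within the same step. That is why fetching, acknowledging, and processing are easier to keep consistent when they live in one step rather than being split across a batchlet and a chunk step.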