Tags: java, batch-processing, java-ee-7, jsr352

How to send emails from a Java EE Batch Job


I need to process a large list of users daily and send them email and SMS notifications based on certain scenarios. I am using the Java EE batch processing model for this. My job XML is as follows:

<step id="sendNotification">
    <chunk item-count="10" retry-limit="3">
        <reader ref="myItemReader"></reader>
        <processor ref="myItemProcessor"></processor>
        <writer ref="myItemWriter"></writer>
        <retryable-exception-classes>
            <include class="java.lang.IllegalArgumentException"/>
        </retryable-exception-classes>
    </chunk>
</step>

MyItemReader's open() method reads all users from the database, and readItem() returns one user at a time using a list iterator. The actual email notification is sent to the user in myItemProcessor, and the users in that chunk are then persisted to the database in myItemWriter.

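import java.io.Serializable;
import java.util.Iterator;
import java.util.List;

import javax.batch.api.chunk.AbstractItemReader;
import javax.inject.Inject;
import javax.inject.Named;

@Named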
public class MyItemReader extends AbstractItemReader {

    private Iterator<User> iterator = null;
    private User lastUser;

    @Inject
    private MyService service;

    @Override
    public void open(Serializable checkpoint) throws Exception {
        super.open(checkpoint);

        List<User> users = service.getUsers();
        iterator = users.iterator();

        if(checkpoint != null) {
            User checkpointUser = (User) checkpoint;
            System.out.println("Checkpoint Found: " + checkpointUser.getUserId());
            while(iterator.hasNext() && !iterator.next().getUserId().equals(checkpointUser.getUserId())) {
                System.out.println("skipping already read users ... ");
            }
        }
    }

    @Override
    public Object readItem() throws Exception {

        User user=null;

        if(iterator.hasNext()) {
            user = iterator.next();
            lastUser = user;
        }
        return user;
    }

    @Override
    public Serializable checkpointInfo() throws Exception {
        return lastUser;
    }
}

My problem is that the checkpoint stores the last record that was processed in the previous chunk. If the next chunk has 10 users and an exception is thrown in myItemProcessor for the 5th user, then on retry the whole chunk is executed again and all 10 users are processed again. I don't want the notification to be sent again to users who were already processed.

Is there a way to handle this? How should this be done efficiently?

Any help would be highly appreciated. Thanks.


Solution

  • I'm going to build on the comments from @cheng. My credit to him here, and hopefully my answer provides additional value in organizing and presenting the options usefully.

    Answer: Queue up messages and have a separate MDB send the emails

    Background:

    As @cheng pointed out, a failure means the entire transaction is rolled back, and the checkpoint doesn't advance.

    So how to deal with the fact that your chunk has sent emails to some users but not all? (You might say it rolled back but with "side effects".)

    So we could restate your question then as: How to send email from a batch chunk step?

    Well, assuming you had a way to send emails through a transactional API (implementing XAResource, etc.) you could use that API.

    Assuming you don't, I would do a transactional write to a JMS queue, and then send the emails with a separate MDB (as @cheng suggested in one of his comments).

    Suggested Alternative: Use ItemWriter to send messages to JMS queue, then use separate MDB to actually send the emails

    With this approach you still gain efficiency by batching the processing and the updates to your DB (you were only sending the emails one at a time anyway), and you can benefit from simple checkpointing and restart without having to write complicated error handling.

    This is also likely to be reusable as a pattern across batch jobs and outside of batch even.
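To illustrate why a transactional queue write avoids duplicate emails, here is a minimal plain-Java simulation. It does not use JMS or the batch API: TransactionalQueue, EmailConsumer, and QueueingChunkDemo are hypothetical stand-ins for the JMS queue, the MDB, and the chunk step. The only behavior it assumes is that messages sent inside a rolled-back transaction are never delivered.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

/** Hypothetical in-memory stand-in for a transactional JMS queue: sends become visible only on commit. */
class TransactionalQueue {
    private final Queue<String> committed = new ArrayDeque<>();
    private final List<String> pending = new ArrayList<>();

    void send(String message) { pending.add(message); }
    void commit() { committed.addAll(pending); pending.clear(); }
    void rollback() { pending.clear(); }
    String poll() { return committed.poll(); }
}

/** Hypothetical stand-in for the MDB: "sends" one email per committed message. */
class EmailConsumer {
    final List<String> emailsSent = new ArrayList<>();

    void drain(TransactionalQueue queue) {
        String userId;
        while ((userId = queue.poll()) != null) {
            emailsSent.add(userId); // a real MDB would send the email here
        }
    }
}

class QueueingChunkDemo {
    /** Runs one chunk in a simulated transaction; fails when it reaches failOn (may be null). */
    static void runChunk(List<String> users, String failOn, TransactionalQueue queue) {
        try {
            for (String user : users) {
                if (user.equals(failOn)) {
                    throw new IllegalArgumentException("processing failed for " + user);
                }
                queue.send(user); // queued, not yet visible to the consumer
            }
            queue.commit();
        } catch (RuntimeException e) {
            queue.rollback(); // discards every message queued by this attempt
            throw e;
        }
    }

    static List<String> demo() {
        TransactionalQueue queue = new TransactionalQueue();
        EmailConsumer consumer = new EmailConsumer();
        List<String> chunk = Arrays.asList("u1", "u2", "u3", "u4", "u5");
        try {
            runChunk(chunk, "u5", queue); // first attempt fails on the 5th user
        } catch (IllegalArgumentException expected) {
            // the chunk rolled back, so nothing was delivered
        }
        consumer.drain(queue);
        runChunk(chunk, null, queue); // the retry succeeds and commits
        consumer.drain(queue);
        return consumer.emailsSent;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [u1, u2, u3, u4, u5]
    }
}
```

In a real job the writer would send to the queue inside the chunk transaction, and the MDB would still need to tolerate message redelivery on its own side.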

    Other alternatives

    Some other ideas that I don't think are as good, listed for the sake of discussion:

    Add batch application logic tracking users emailed (with ItemProcessListener)

    You could build your own list of successful emails, failed emails, or both, using the ItemProcessListener methods afterProcess and onProcessError.

    On restart, you would then know which users in the current chunk had already been emailed. The reader is repositioned to the start of that chunk because the entire chunk rolled back, even though some emails have already been sent.

    This certainly complicates your batch logic, and you also have to persist this success or failure list somehow. Plus this approach is probably highly specific to this job (as opposed to queuing up for an MDB to process).

    But it's simpler in that you have a single batch job without the need for a messaging provider and a separate app component.

    If you go this route you might want to use a combination of both a skippable and a "no-rollback" retryable exception.
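The tracking idea can be sketched in plain Java as follows. This is only a simulation under stated assumptions: SentLog, TrackingProcessor, and TrackingDemo are hypothetical names, the log is an in-memory set, and the retry loop stands in for the batch runtime. In a real job the log would have to be persisted outside the chunk transaction (otherwise it rolls back with the chunk), and markSent would be called from ItemProcessListener.afterProcess.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical record of users already emailed; must survive a chunk rollback. */
class SentLog {
    private final Set<String> sent = new HashSet<>();

    void markSent(String userId) { sent.add(userId); }
    boolean alreadySent(String userId) { return sent.contains(userId); }
}

/** Hypothetical processor that consults the log before sending. */
class TrackingProcessor {
    private final SentLog log;
    private String failOnceOn; // simulates a one-time failure for this user
    final List<String> emailsSent = new ArrayList<>();

    TrackingProcessor(SentLog log, String failOnceOn) {
        this.log = log;
        this.failOnceOn = failOnceOn;
    }

    void process(String userId) {
        if (log.alreadySent(userId)) {
            return; // emailed in an earlier attempt, do not send again
        }
        if (userId.equals(failOnceOn)) {
            failOnceOn = null;
            throw new IllegalArgumentException("send failed for " + userId);
        }
        emailsSent.add(userId); // simulated email send
        log.markSent(userId);   // what afterProcess would record
    }
}

class TrackingDemo {
    static List<String> demo() {
        TrackingProcessor processor = new TrackingProcessor(new SentLog(), "u3");
        List<String> chunk = Arrays.asList("u1", "u2", "u3", "u4", "u5");
        for (int attempt = 1; attempt <= 2; attempt++) {
            try {
                for (String userId : chunk) {
                    processor.process(userId);
                }
                break; // chunk completed
            } catch (IllegalArgumentException e) {
                // chunk rolled back; the whole chunk is processed again
            }
        }
        return processor.emailsSent;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [u1, u2, u3, u4, u5]
    }
}
```

Each user appears once in the result even though u1 and u2 were processed twice, which is the property the listener-based bookkeeping is meant to provide.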

    Single-item chunk

    If you define your chunk with item-count="1", then you avoid complicated checkpointing and error handling code. You sacrifice efficiency though, so this would only make sense if the other aspects of batch were very compelling, e.g. scheduling and management of jobs through a common interface, or the ability to restart at the failing step within a job.

    If you were to go this route, you might want to consider defining socket and timeout exceptions as "no-rollback" exceptions (using <no-rollback-exception-classes>) since there's nothing to be gained from rolling back, and you might want to retry on a network timeout issue.
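As a rough sketch, the step from the question could be adapted along these lines. The choice of java.net.SocketTimeoutException is an assumption for illustration; it only helps if the processor lets that exception propagate unwrapped, so substitute whatever your mail code actually throws.

```xml
<step id="sendNotification">
    <chunk item-count="1" retry-limit="3">
        <reader ref="myItemReader"/>
        <processor ref="myItemProcessor"/>
        <writer ref="myItemWriter"/>
        <!-- Retry the item when the network call times out (assumed exception type) -->
        <retryable-exception-classes>
            <include class="java.net.SocketTimeoutException"/>
        </retryable-exception-classes>
        <!-- Retry without rolling back the current chunk -->
        <no-rollback-exception-classes>
            <include class="java.net.SocketTimeoutException"/>
        </no-rollback-exception-classes>
    </chunk>
</step>
```

Listing the same class in both elements is what makes it a retryable exception that does not trigger a rollback.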

    Since you specifically mentioned efficiency, I'm guessing this is a bad fit for you.

    Use a Transaction Synchronization

    This might work, but the batch API doesn't make it especially easy, and you could still have a case where the chunk completes but one or more email sends fail.