I am new to the batch processing world and I am trying to solve the problem described below using Spring Batch. I am really struggling with how to build a multi-step batch job for it.
Given
A CSV file containing records for multiple students:
| studentId | subject1_score | subject2_score | subject3_score | result |
|---|---|---|---|---|
| 1 | 59 | 51 | 54 | PENDING |
| 2 | 79 | 20 | 76 | PENDING |
We have a REST endpoint which takes each student's marks in all subjects and returns the result (PASS/FAIL) for each student. The pass/fail logic is defined in that REST endpoint.
TODO
Read the records from the CSV in batches, make one REST call per batch which determines the result based on the marks in all three subjects for each student, update the result for each student, and generate an output CSV containing all the records.
class StudentMarksheet {
String studentId;
Integer subject1_score;
Integer subject2_score;
Integer subject3_score;
String result;
...
}
class GenerateResultRequestResponseDto {
Long batchId;
List<StudentMarksheet> students;
...
}
Expected output:

| studentId | subject1_score | subject2_score | subject3_score | result |
|---|---|---|---|---|
| 1 | 59 | 51 | 54 | PASS |
| 2 | 79 | 20 | 76 | FAIL |
Update on Requirement
We can receive either a CSV or an XML file. Based on the file type, we have two different readers and writers (one pair for reading and writing CSV files and one for XML files).
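For reference, here is a rough sketch of what those two readers could look like. This assumes Spring Batch 4 builders, a comma-delimited CSV, the input file path passed as a job parameter, spring-oxm/JAXB on the classpath, and JAXB annotations on `StudentMarksheet` for the XML case; the field and element names are illustrative only:

```java
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.batch.item.xml.StaxEventItemReader;
import org.springframework.batch.item.xml.builder.StaxEventItemReaderBuilder;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;
import org.springframework.oxm.jaxb.Jaxb2Marshaller;

@Configuration
public class ReaderConfig {

    // CSV reader: maps the delimited columns onto StudentMarksheet properties.
    @Bean
    @StepScope
    public FlatFileItemReader<StudentMarksheet> csvStudentReader(
            @Value("#{jobParameters['inputFile']}") String inputFile) {
        return new FlatFileItemReaderBuilder<StudentMarksheet>()
                .name("csvStudentReader")
                .resource(new FileSystemResource(inputFile))
                .linesToSkip(1) // skip the header row
                .delimited()    // comma-delimited by default
                .names(new String[] {"studentId", "subject1_score",
                        "subject2_score", "subject3_score", "result"})
                .targetType(StudentMarksheet.class)
                .build();
    }

    // XML reader: unmarshals each <studentMarksheet> fragment (element name assumed).
    @Bean
    @StepScope
    public StaxEventItemReader<StudentMarksheet> xmlStudentReader(
            @Value("#{jobParameters['inputFile']}") String inputFile) {
        Jaxb2Marshaller marshaller = new Jaxb2Marshaller();
        marshaller.setClassesToBeBound(StudentMarksheet.class);
        return new StaxEventItemReaderBuilder<StudentMarksheet>()
                .name("xmlStudentReader")
                .resource(new FileSystemResource(inputFile))
                .addFragmentRootElements("studentMarksheet")
                .unmarshaller(marshaller)
                .build();
    }
}
```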
My Design solution
The reader reads a single record and creates a StudentMarksheet object from it -> the processor decides whether the record is valid or not -> the writer prepares the GenerateResultRequestResponseDto, executes the REST call for one batch of records, and writes the results to the CSV file.
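For the validation part, a minimal sketch of such a processor is shown here; the validity rule (all three scores must be present) and the getter names are only assumptions for illustration:

```java
import org.springframework.batch.item.ItemProcessor;

public class StudentMarksheetValidatingProcessor
        implements ItemProcessor<StudentMarksheet, StudentMarksheet> {

    @Override
    public StudentMarksheet process(StudentMarksheet item) {
        // Illustrative validity check (assumed): all three scores must be present.
        boolean valid = item.getSubject1_score() != null
                && item.getSubject2_score() != null
                && item.getSubject3_score() != null;
        // Returning null filters the record out of the chunk, so it is never written.
        return valid ? item : null;
    }
}
```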
The big question here is: do I make two jobs, one for CSV and another for XML?
Since your REST endpoint accepts a list of students, and you need to process them in chunks just before writing them to the file, you can use an `ItemWriteListener#beforeWrite(List)` and make your call in there. This listener is the first extension point where you get a list of items. So your chunk-oriented step could be designed as follows:
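Here is a minimal sketch of such a step. It assumes Spring Batch 4 (where `beforeWrite` takes a `List`), an available `RestTemplate` bean, standard getters/setters on the two DTOs from the question, and a hypothetical endpoint URL (`http://localhost:8080/generate-result`); the bean and method names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

import org.springframework.batch.core.ItemWriteListener;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;

@Configuration
public class GenerateResultStepConfig {

    // Listener that calls the REST endpoint once per chunk, just before writing.
    @Bean
    public ItemWriteListener<StudentMarksheet> resultRestCallListener(RestTemplate restTemplate) {
        return new ItemWriteListener<StudentMarksheet>() {
            @Override
            public void beforeWrite(List<? extends StudentMarksheet> items) {
                GenerateResultRequestResponseDto request = new GenerateResultRequestResponseDto();
                request.setStudents(new ArrayList<>(items)); // assumes standard setters on the DTO
                GenerateResultRequestResponseDto response = restTemplate.postForObject(
                        "http://localhost:8080/generate-result", // hypothetical URL
                        request, GenerateResultRequestResponseDto.class);
                // Copy the computed result back onto the items about to be written
                // (assumes the endpoint returns students in the same order it received them).
                for (int i = 0; i < items.size(); i++) {
                    items.get(i).setResult(response.getStudents().get(i).getResult());
                }
            }

            @Override
            public void afterWrite(List<? extends StudentMarksheet> items) { }

            @Override
            public void onWriteError(Exception exception, List<? extends StudentMarksheet> items) { }
        };
    }

    @Bean
    public Step generateResultStep(StepBuilderFactory stepBuilderFactory,
                                   ItemReader<StudentMarksheet> reader,
                                   ItemProcessor<StudentMarksheet, StudentMarksheet> processor,
                                   ItemWriter<StudentMarksheet> writer,
                                   ItemWriteListener<StudentMarksheet> resultRestCallListener) {
        return stepBuilderFactory.get("generateResultStep")
                .<StudentMarksheet, StudentMarksheet>chunk(100) // chunk size = size of each REST batch
                .reader(reader)       // CSV or XML reader
                .processor(processor) // validation
                .writer(writer)       // CSV or XML writer
                .listener(resultRestCallListener)
                .build();
    }
}
```

With this design the chunk size defines the size of each REST batch, and the writer stays a plain file writer (for example a `FlatFileItemWriter` for CSV or a `StaxEventItemWriter` for XML), so it only writes records whose result has already been updated by the listener.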