I'm new to Spring Batch chunk processing and want to understand how the reader works.
Here is the scenario: I'm implementing a purge of user accounts.
Reader: reads all the user accounts that match the purge criteria, in order.
Processor: for each user account, based on some calculation, it may create a new user account and also change the current record (say, mark it as purged).
Question: how does the reader work? Say I have 5000 user accounts and my chunk size is 1000.
Will the reader read 1000 records and then start the processor (say the processor creates another 100 new records), after which the writer writes whatever records were updated?
To read the next 1000 records, will the reader execute the query again? How does it know where to start?
I'm using Hibernate.
To answer your specific question, it depends on the ItemReader implementation you use. If you're using the JdbcCursorItemReader, we hold the cursor open during the entire process, so we're really reading from the execution of one query. If you're using the JdbcPagingItemReader, then where the next chunk begins is based on the pagination logic.
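To make the paging case concrete, here is a minimal in-memory sketch in plain Java (no Spring Batch classes). It assumes a paging reader that remembers the sort-key value of the last row it returned and issues a fresh query for each page. The table, the class name `PagingReaderSketch`, and the SQL shown in the comments are hypothetical stand-ins for illustration, not the library's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a user_account table keyed by id.
class PagingReaderSketch {
    // id -> purged flag; the TreeMap keeps rows sorted by id, like ORDER BY id.
    final TreeMap<Long, Boolean> table = new TreeMap<>();
    int queriesRun = 0;
    // Sort-key value of the last row returned; this is how the reader "knows where to start".
    private long lastId = Long.MIN_VALUE;

    // Mimics (assumed SQL): SELECT id FROM user_account
    //   WHERE purged = false AND id > :lastId ORDER BY id LIMIT :pageSize
    List<Long> readPage(int pageSize) {
        queriesRun++;
        List<Long> page = new ArrayList<>();
        for (Map.Entry<Long, Boolean> row : table.tailMap(lastId, false).entrySet()) {
            if (page.size() == pageSize) break;
            if (!row.getValue()) page.add(row.getKey());
        }
        if (!page.isEmpty()) lastId = page.get(page.size() - 1);
        return page;
    }

    // Reads page by page; after the first page, one new matching account is created.
    static List<Long> run(int initialRows, int pageSize) {
        PagingReaderSketch db = new PagingReaderSketch();
        for (long id = 1; id <= initialRows; id++) db.table.put(id, false);
        List<Long> seen = new ArrayList<>();
        boolean inserted = false;
        List<Long> page;
        while (!(page = db.readPage(pageSize)).isEmpty()) {
            seen.addAll(page);
            for (Long id : page) db.table.put(id, true); // "writer" marks rows as purged
            if (!inserted) {                              // "processor" creates a new account
                db.table.put((long) initialRows + 1, false);
                inserted = true;
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run(5, 2));
    }
}
```

In this sketch each page is a separate query, so the account created after the first page is picked up by a later page. A cursor-based reader would instead iterate over the result of a single query.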
A couple notes:
With the JdbcPagingItemReader, each query is a unique query, so if you add records that meet the criteria, they will be returned as well (I'm not 100% sure what would happen if the underlying data changed while a cursor was open; it may be a function of the db itself). Typically, you'll tag the records you want to process in that batch run with some form of flag (timestamp, processing flag, etc.).
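The tagging idea can be sketched the same way in plain Java. This sketch assumes a hypothetical `run_tag` column that is set on all matching rows before the step starts, and a paged query that filters on that tag. The class name `TaggedRunSketch`, the column, and the SQL in the comments are illustrative assumptions, not something Spring Batch provides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a user_account table with a run_tag column.
class TaggedRunSketch {
    // id -> run tag; null means the row was not selected for any batch run.
    final TreeMap<Long, String> runTagById = new TreeMap<>();
    private long lastId = Long.MIN_VALUE;

    // Mimics (assumed SQL): UPDATE user_account SET run_tag = :runTag WHERE <purge criteria>
    void tagAll(String runTag) {
        runTagById.replaceAll((id, oldTag) -> runTag);
    }

    // Mimics (assumed SQL): SELECT id FROM user_account
    //   WHERE run_tag = :runTag AND id > :lastId ORDER BY id LIMIT :pageSize
    List<Long> readPage(String runTag, int pageSize) {
        List<Long> page = new ArrayList<>();
        for (Map.Entry<Long, String> row : runTagById.tailMap(lastId, false).entrySet()) {
            if (page.size() == pageSize) break;
            if (runTag.equals(row.getValue())) page.add(row.getKey());
        }
        if (!page.isEmpty()) lastId = page.get(page.size() - 1);
        return page;
    }

    // Tags the existing rows once, then reads page by page while new untagged rows appear.
    static List<Long> run(int initialRows, int pageSize) {
        TaggedRunSketch db = new TaggedRunSketch();
        for (long id = 1; id <= initialRows; id++) db.runTagById.put(id, null);
        db.tagAll("run-1");
        long nextId = initialRows + 1L;
        List<Long> seen = new ArrayList<>();
        List<Long> page;
        while (!(page = db.readPage("run-1", pageSize)).isEmpty()) {
            seen.addAll(page);
            db.runTagById.put(nextId++, null); // "processor" creates a new, untagged account
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run(5, 2));
    }
}
```

Because the accounts created during the run never receive the tag, the paged query does not return them, and the run processes only the rows that existed when it was tagged.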