apache-beam apache-beam-io

Reading only a user-specified maximum number of records and processing them by batch size in Apache Beam


I am trying to read records from a source, limited to a total maximum number of records to process that is supplied by the user.

Example: the source table holds 1 million records in total, and the maximum number of records to process is 100K.

I need to process only those 100K records from the source. I have gone through the JDBC IO library classes looking for a way to implement this, such as an option to set the batch size, but I have found none.

PS: I want to implement this at the IO level, not by adding a LIMIT to the query.


Solution

  • I was able to do it using setMaxRows, after turning off auto-commit for JDBC IO.
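
The answer does not show any code, so the following is only a minimal sketch of the two JDBC calls it names: Connection.setAutoCommit(false) and Statement.setMaxRows(int). The class name RowCap and the method apply are hypothetical, invented here for illustration, and how the answer's author wired these calls into the pipeline is an assumption.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical helper (not part of Beam or JDBC): caps the rows a statement returns. */
public final class RowCap {
    private RowCap() {}

    /**
     * Turns off auto-commit on the connection and limits the statement to at
     * most maxRecords rows. The driver silently drops rows beyond the cap.
     */
    public static void apply(Connection connection, PreparedStatement statement, int maxRecords)
            throws SQLException {
        // In JDBC, setMaxRows(0) means "no limit", so insist on a positive cap.
        if (maxRecords <= 0) {
            throw new IllegalArgumentException("maxRecords must be positive, got " + maxRecords);
        }
        connection.setAutoCommit(false);   // the "auto-commit off" step from the answer
        statement.setMaxRows(maxRecords);  // e.g. 100000 for the 100K case in the question
    }
}
```

In a Beam pipeline, one plausible place for the setMaxRows call is a JdbcIO.StatementPreparator passed to withStatementPreparator, since it receives the PreparedStatement before the query runs; the answer does not confirm that this is what was done. Note that the cap applies per statement, not to the source table as a whole.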