I am using Spring Boot, datastax-java-cassandra-connector_2.11-2.4.1.jar, and Java 8.
I have a scenario where I need to read data from a C* table that might contain millions of records.
Is there any way in Java/Spring Boot, using the datastax-java-cassandra-connector API, to pull the data partition by partition?
While `select * from table` may work, a more effective way is to read the data by token ranges, with a query like `select * from table where token(part_key) > beginRange and token(part_key) <= endRange`. The Spark Cassandra connector works the same way: it gets the list of all available token ranges and then fetches the data for each range, sending each query directly to the node that holds that token range (as opposed to `select * from table`, which retrieves all data via the coordinator node).
You need to be careful when calculating the token boundaries, especially at the beginning and end of the full range. You can find an example of the Java code in my repository (it's too long to paste here).
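To illustrate the boundary handling, here is a minimal sketch in plain Java. It is not the code from the repository. It assumes the ring tokens are already available as `long` values (as they would be with the Murmur3 partitioner), and the names `TokenRangeQueries`, `Range`, `splitRing`, and `ks.tbl` are made up for this example. The idea is to leave the first and last ranges open-ended, so that the part of the ring that wraps around is covered without relying on the partitioner's exact minimum and maximum token values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: splits the token ring into ranges that cover it exactly once. */
final class TokenRangeQueries {

    /** A range (start, end]; a null bound means "unbounded on that side". */
    static final class Range {
        final Long start; // exclusive lower bound, or null
        final Long end;   // inclusive upper bound, or null

        Range(Long start, Long end) {
            this.start = start;
            this.end = end;
        }

        boolean contains(long token) {
            return (start == null || token > start) && (end == null || token <= end);
        }

        /** Renders the range as a CQL query; table and key names are supplied by the caller. */
        String toCql(String table, String partitionKey) {
            String tokenExpr = "token(" + partitionKey + ")";
            List<String> conditions = new ArrayList<>();
            if (start != null) {
                conditions.add(tokenExpr + " > " + start);
            }
            if (end != null) {
                conditions.add(tokenExpr + " <= " + end);
            }
            String where = conditions.isEmpty() ? "" : " WHERE " + String.join(" AND ", conditions);
            return "SELECT * FROM " + table + where;
        }
    }

    /** Builds ranges from the ring tokens; the wrap-around range becomes two open-ended ranges. */
    static List<Range> splitRing(List<Long> ringTokens) {
        List<Long> sorted = new ArrayList<>(ringTokens);
        Collections.sort(sorted);
        List<Range> ranges = new ArrayList<>();
        if (sorted.isEmpty()) {
            ranges.add(new Range(null, null)); // no tokens known: one full scan
            return ranges;
        }
        // Lower part of the wrap-around range: everything up to the smallest token.
        ranges.add(new Range(null, sorted.get(0)));
        // Regular ranges between consecutive tokens.
        for (int i = 0; i + 1 < sorted.size(); i++) {
            ranges.add(new Range(sorted.get(i), sorted.get(i + 1)));
        }
        // Upper part of the wrap-around range: everything above the largest token.
        ranges.add(new Range(sorted.get(sorted.size() - 1), null));
        return ranges;
    }

    public static void main(String[] args) {
        // Example tokens only; in practice they would come from the cluster's ring metadata.
        List<Long> tokens = Arrays.asList(-6000000000000000000L, 0L, 6000000000000000000L);
        for (Range range : splitRing(tokens)) {
            System.out.println(range.toCql("ks.tbl", "part_key"));
        }
    }
}
```

Each generated query can then be executed separately, and the results processed range by range. This sketch only covers the boundary calculation; routing each query to the node that owns the range, and using prepared statements with bound values instead of string concatenation, are left out.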