spark-cassandra-connector configuration: concurrent.reads vs input.reads_per_sec

concurrent.reads: Sets read parallelism for joinWithCassandra tables.

input.reads_per_sec: Sets max requests per core per second for joinWithCassandraTable

Setting concurrent.reads to 4 on a 4-core Spark executor means that at most 16 requests will run at the same time.

It looks like concurrent.reads does the same thing as input.reads_per_sec.

What is the true difference between them?

Solution

They are not the same, although they are related:

concurrent.reads defines how many requests per core can be outstanding at the same time (so-called in-flight requests). In some cases you can lower it from the default to avoid overloading Cassandra nodes with too many parallel requests;
input.reads_per_sec defines how many requests per core can be executed per second.

In other words, the first setting bounds concurrency at any given instant, while the second bounds throughput over time.
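
To make the distinction concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch, not connector code: the function names and the latency figures are hypothetical, it takes the per-core reading of both settings used above at face value, and it assumes every request takes the same time to complete.

```python
def max_in_flight(concurrent_reads, cores):
    """Upper bound on requests outstanding at one instant on an executor."""
    return concurrent_reads * cores


def max_reads_per_sec_per_core(concurrent_reads, reads_per_sec, latency_ms):
    """Upper bound on sustained per-core throughput.

    With concurrent_reads requests in flight and each one taking
    latency_ms to complete, at most concurrent_reads * 1000 / latency_ms
    requests can finish per second. The rate limit caps throughput
    independently, so the lower of the two bounds applies.
    """
    concurrency_bound = concurrent_reads * 1000 / latency_ms
    return min(concurrency_bound, reads_per_sec)


# The example from the question: 4 concurrent reads on a 4-core executor.
in_flight = max_in_flight(concurrent_reads=4, cores=4)

# Fast reads (assumed 5 ms each): the rate limit is the tighter bound.
fast = max_reads_per_sec_per_core(4, reads_per_sec=500, latency_ms=5)

# Slow reads (assumed 20 ms each): the concurrency limit is the tighter bound.
slow = max_reads_per_sec_per_core(4, reads_per_sec=500, latency_ms=20)

print(in_flight, fast, slow)
```

Under these assumptions, which setting actually limits the job depends on how quickly Cassandra answers: when reads are fast the per-second cap is hit first, and when reads are slow the in-flight cap is.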