Tags: apache-spark, cassandra, datastax-enterprise, cassandra-3.0, spark-cassandra-connector

Why does a read fail in cqlsh when huge numbers of tombstones are present?


I have a table with a huge number of tombstones. When I ran a Spark job that reads from that specific table, it returned results without any issues. But when I executed the same query using cqlsh, it gave me an error because of the huge number of tombstones in that table:

Cassandra failure during read query at consistency ONE (1 replica needed but 0 replicas responded, 1 failed)

I know the tombstones should not be there, and I can run a repair to get rid of them. But apart from that, why did Spark succeed where cqlsh failed? They both use the same sessions and queries.

How does the spark-cassandra-connector work? Is it different from cqlsh? Please let me know.

Thank you.


Solution

  • The Spark Cassandra Connector is different from cqlsh in a few ways.

    • It uses the Java driver rather than the Python driver
    • It has significantly more lenient retry policies
    • It performs full table scans by breaking the request up into many small token-range pieces

    Any of these differences could explain why the read works through the Spark Cassandra Connector but fails in cqlsh.
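To make the last point concrete, here is a rough sketch of the token-range splitting idea: instead of one full-table scan (which must walk every tombstone in a single request, as cqlsh does), the connector issues many small `token(...)`-bounded queries, so each individual request touches far fewer tombstones. This is an illustration of the technique only, not the connector's actual API; the function names, the partition-key name `pk`, and the split count are all made up for the example.

```python
# Illustrative sketch: splitting the full Murmur3 token ring into
# contiguous slices, the way a connector-style full table scan breaks
# one big read into many small per-range CQL queries.
# (Hypothetical helper names; not the Spark Cassandra Connector API.)

MIN_TOKEN = -2**63       # Murmur3Partitioner token range lower bound
MAX_TOKEN = 2**63 - 1    # ...and upper bound

def split_token_ring(num_splits):
    """Divide [MIN_TOKEN, MAX_TOKEN] into contiguous (start, end] slices."""
    step = (MAX_TOKEN - MIN_TOKEN) // num_splits
    splits = []
    start = MIN_TOKEN
    for i in range(num_splits):
        # Last slice absorbs any rounding remainder so the ring is covered.
        end = MAX_TOKEN if i == num_splits - 1 else start + step
        splits.append((start, end))
        start = end
    return splits

def slice_to_cql(table, start, end):
    # Each slice becomes its own small query instead of one huge scan;
    # 'pk' stands in for the table's partition key.
    return (f"SELECT * FROM {table} "
            f"WHERE token(pk) > {start} AND token(pk) <= {end}")

ranges = split_token_ring(8)
queries = [slice_to_cql("my_keyspace.my_table", s, e) for s, e in ranges]
```

Each of the eight queries above reads only its own slice of the ring, and a per-query failure can be retried independently, which is why a lenient retry policy plus small ranges can push a scan through a table that a single cqlsh query cannot survive.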