Tags: apache-spark, cassandra, datastax-enterprise, cassandra-3.0, spark-cassandra-connector

Why does a read fail in cqlsh when huge numbers of tombstones are present?


I have a table with a huge number of tombstones. When I ran a Spark job that reads from that specific table, it returned results without any issues. But when I executed the same query using cqlsh, it gave me an error because of the huge number of tombstones in that table:

Cassandra failure during read query at consistency ONE (1 replica needed but 0 replicas responded, 1 failed)

I know the tombstones should not be there, and I can run a repair to get rid of them. But apart from that, why did Spark succeed where cqlsh failed? They both use the same sessions and queries.

How does the spark-cassandra-connector work? Is it different from cqlsh? Please let me know.

Thank you.


Solution

  • The Spark Cassandra Connector is different from cqlsh in a few ways.

    • It uses the Java driver rather than the Python driver
    • It has significantly more lenient retry policies
    • It performs full table scans by breaking the request up into many small token-range pieces

    Any of these differences could explain why the read works through the Spark Cassandra Connector but fails in cqlsh.
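To make the last point concrete, here is a rough sketch of the token-range splitting idea: instead of one full-table scan (which must walk every tombstone in a single request, as cqlsh does), the connector issues many small `token(...)`-bounded queries, so each individual request touches far fewer tombstones. This is an illustration of the technique only, not the connector's actual API; the function names, the partition-key name `pk`, and the split count are all made up for the example.

```python
# Illustrative sketch: splitting the full Murmur3 token ring into
# contiguous slices, the way a connector-style full table scan breaks
# one big read into many small per-range CQL queries.
# (Hypothetical helper names; not the Spark Cassandra Connector API.)

MIN_TOKEN = -2**63       # Murmur3Partitioner token range lower bound
MAX_TOKEN = 2**63 - 1    # ...and upper bound

def split_token_ring(num_splits):
    """Divide [MIN_TOKEN, MAX_TOKEN] into contiguous (start, end] slices."""
    step = (MAX_TOKEN - MIN_TOKEN) // num_splits
    splits = []
    start = MIN_TOKEN
    for i in range(num_splits):
        # Last slice absorbs any rounding remainder so the ring is covered.
        end = MAX_TOKEN if i == num_splits - 1 else start + step
        splits.append((start, end))
        start = end
    return splits

def slice_to_cql(table, start, end):
    # Each slice becomes its own small query instead of one huge scan;
    # 'pk' stands in for the table's partition key.
    return (f"SELECT * FROM {table} "
            f"WHERE token(pk) > {start} AND token(pk) <= {end}")

ranges = split_token_ring(8)
queries = [slice_to_cql("my_keyspace.my_table", s, e) for s, e in ranges]
```

Each of the eight queries above reads only its own slice of the ring, and a per-query failure can be retried independently, which is why a lenient retry policy plus small ranges can push a scan through a table that a single cqlsh query cannot survive.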