I got this error when trying to submit an app from the master node:
dse -u abc -p abc spark-submit --conf spark.cassandra.auth.username=abc --conf spark.cassandra.auth.password=abc --conf spark.debug.maxToStringFields=10000 --conf spark.executor.memory=4G app.py
I'm using 3 DSE Analytics nodes in 1 datacenter, each node with 4 cores/16 GB RAM, and I submit the app from the master node. When I check the tasks/stages I see this error:
Has anybody ever seen this error?
You have a problem with the application that writes data into your tables: either it deletes a lot of data, or (most probably) it inserts nulls as part of "normal" inserts. In that case tombstones are generated, and if you have a lot of them, queries start to fail.
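For illustration, here is a minimal PySpark sketch of the kind of write that causes this (the keyspace, table, and column names are hypothetical): every null column value sent through the Spark Cassandra Connector is stored as a cell tombstone unless nulls are ignored.

```python
# Minimal sketch (hypothetical keyspace/table/columns): a DataFrame containing
# nulls written through the Spark Cassandra Connector. Each None value is sent
# as a null and stored as a cell tombstone unless ignoreNulls is enabled.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("tombstone-demo").getOrCreate()

df = spark.createDataFrame(
    [(1, "alice", "alice@example.com"),
     (2, "bob", None)],                      # this None becomes a tombstone
    ["id", "name", "email"],
)

(df.write
   .format("org.apache.spark.sql.cassandra")
   .options(table="users", keyspace="test")  # hypothetical target table
   .mode("append")
   .save())
```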
What you can do:
- pass --conf spark.cassandra.output.ignoreNulls=true - this prevents writing nulls, but it may not work well when you need to overwrite existing data. If you're using other driver(s), use unset for fields that have a null value (see the sketch after this list);
- tune gc_grace_seconds, but this comes with its own challenges - I recommend reading this article for a better understanding of the problem.
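As a sketch of the unset approach with the DataStax Python driver (contact point, keyspace, table, and column names are assumptions for illustration): instead of binding None, bind UNSET_VALUE so the column is left untouched and no tombstone is written.

```python
# Sketch of the "unset" approach with the DataStax Python driver.
# Hypothetical keyspace/table/columns; adjust contact points to your cluster.
from cassandra.cluster import Cluster
from cassandra.query import UNSET_VALUE

cluster = Cluster(["127.0.0.1"])
session = cluster.connect("test")

insert = session.prepare(
    "INSERT INTO users (id, name, email) VALUES (?, ?, ?)"
)

row = {"id": 1, "name": "alice", "email": None}

# Replace Python None with UNSET_VALUE so the driver skips the column
# entirely instead of writing a null (and therefore a tombstone).
session.execute(insert, (
    row["id"],
    row["name"],
    row["email"] if row["email"] is not None else UNSET_VALUE,
))
```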