java nosql cassandra hector

Inconsistent counter values between replicas in Cassandra


I've got a three-node Cassandra cluster using the rack-unaware placement strategy with a replication factor of 2.

The column family is defined as follows:

create column family UserGeneralStats  with  comparator = UTF8Type  and default_validation_class = CounterColumnType;

Unfortunately, after a few days of production use, the replicas returned inconsistent values for the counters:

Query on replica 1:

[default@StatsKeyspace] list UserGeneralStats['5261666978': '5261666978'];  
Using default limit of 100
-------------------
RowKey: 5261666978
=> (counter=bandwidth, value=96545030198)
=> (counter=downloads, value=1013)
=> (counter=previews, value=10304)

Query on replica 2:

[default@StatsKeyspace] list UserGeneralStats['5261666978': '5261666978'];
Using default limit of 100
-------------------
RowKey: 5261666978
=> (counter=bandwidth, value=9140386229)
=> (counter=downloads, value=339)
=> (counter=previews, value=1321)

Since the standard read repair mechanism doesn't seem to repair the values, I tried to force an anti-entropy repair using nodetool repair. It didn't have any effect on the counter values.

Data inspection showed that the lower counter values are the correct ones, so I suspect that either Cassandra or Hector (which I use as the Java client API for Cassandra) retried some increments.
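To illustrate why a retried increment would inflate a counter, here is a minimal, self-contained sketch. It does not use Cassandra or Hector at all: `FakeCounter`, `loseNextAck` and `incrementWithRetry` are hypothetical names invented for this example, and the "lost acknowledgement" is only a simulation of a write that was applied but reported to the client as timed out. Under those assumptions, a blind retry applies the delta twice, because an increment is not idempotent.

```java
import java.util.concurrent.atomic.AtomicLong;

public class CounterRetryDemo {

    /** Hypothetical in-memory stand-in for a counter column; not a Cassandra or Hector class. */
    static final class FakeCounter {
        private final AtomicLong value = new AtomicLong();
        private boolean loseNextAck;

        /** Make the next increment succeed on the "server" but look like a timeout to the caller. */
        void loseNextAck() {
            loseNextAck = true;
        }

        /** Applies the delta, then returns false if the acknowledgement is (simulated as) lost. */
        boolean increment(long delta) {
            value.addAndGet(delta); // the write is applied in every case
            if (loseNextAck) {
                loseNextAck = false;
                return false; // the caller only sees a failure
            }
            return true;
        }

        long value() {
            return value.get();
        }
    }

    /** Naive client-side retry: repeats the increment until it is acknowledged. */
    static long incrementWithRetry(FakeCounter counter, long delta, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (counter.increment(delta)) {
                break;
            }
        }
        return counter.value();
    }

    public static void main(String[] args) {
        FakeCounter downloads = new FakeCounter();
        downloads.loseNextAck();
        long actual = incrementWithRetry(downloads, 1, 3);
        System.out.println("intended=1 actual=" + actual); // prints "intended=1 actual=2"
    }
}
```

Whether this is what actually happened in my cluster is only a guess; the sketch just shows that an applied-but-unacknowledged increment followed by a retry is enough to overcount.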

Any ideas on how to repair the data and possibly prevent the situation from happening again?


Solution

  • If neither read repair (RR) nor nodetool repair fixes it, it's probably a bug.

    Please upgrade to 0.8.3 (out today) and verify that the problem is still present in that version. If it is, you can file a ticket at https://issues.apache.org/jira/browse/CASSANDRA.