cassandra, datastax, bitnami

Not marking nodes down due to local pause of 8478595263 > 5000000000


I have a 3-node Cassandra cluster in Kubernetes, deployed with the bitnami/cassandra Helm chart.

After some time under a higher number of requests, I start getting the following errors:

WARN  [GossipTasks:1] 2020-01-09 11:39:33,070 FailureDetector.java:278 - Not marking nodes down due to local pause of 8206335128 > 5000000000
WARN  [GossipTasks:1] 2020-01-09 11:39:42,238 FailureDetector.java:278 - Not marking nodes down due to local pause of 6668041401 > 5000000000
WARN  [GossipTasks:1] 2020-01-09 11:40:03,341 FailureDetector.java:278 - Not marking nodes down due to local pause of 15041441083 > 5000000000
WARN  [PERIODIC-COMMIT-LOG-SYNCER] 2020-01-09 11:41:55,606 NoSpamLogger.java:94 - Out of 1 commit log syncs over the past 0.00s with average duration of 11850.79ms, 1 have exceeded the configured commit interval by an average of 1850.79ms
WARN  [GossipTasks:1] 2020-01-09 11:42:20,019 Gossiper.java:783 - Gossip stage has 1 pending tasks; skipping status check (no nodes will be marked down)
INFO  [RequestResponseStage-1] 2020-01-09 11:45:36,329 Gossiper.java:1011 - InetAddress /100.96.7.7 is now UP
INFO  [RequestResponseStage-1] 2020-01-09 11:45:36,330 Gossiper.java:1011 - InetAddress /100.96.7.7 is now UP
INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,931 MessagingService.java:1236 - MUTATION messages were dropped in last 5000 ms: 0 internal and 45 cross node. Mean internal dropped latency: 0 ms and Mean cross-node dropped latency: 2874 ms
INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,933 StatusLogger.java:47 - Pool Name                    Active   Pending      Completed   Blocked  All Time Blocked
INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,949 StatusLogger.java:51 - MutationStage                     0         0         226236         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,950 StatusLogger.java:51 - ViewMutationStage                 0         0              0         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,950 StatusLogger.java:51 - ReadStage                         0         0         244468         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,951 StatusLogger.java:51 - RequestResponseStage              0         0         341270         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,952 StatusLogger.java:51 - ReadRepairStage                   0         0           5395         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,953 StatusLogger.java:51 - CounterMutationStage              0         0              0         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,958 StatusLogger.java:51 - MiscStage                         0         0              0         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,959 StatusLogger.java:51 - CompactionExecutor                0         0         686641         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,960 StatusLogger.java:51 - MemtableReclaimMemory             0         0            689         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,962 StatusLogger.java:51 - PendingRangeCalculator            0         0              9         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,964 StatusLogger.java:51 - GossipStage                       0         0        3093860         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,966 StatusLogger.java:51 - SecondaryIndexManagement          0         0              0         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,970 StatusLogger.java:51 - HintsDispatcher                   0         0             10         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,973 StatusLogger.java:51 - MigrationStage                    0         0              6         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,973 StatusLogger.java:51 - MemtablePostFlush                 0         0            717         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,974 StatusLogger.java:51 - PerDiskMemtableFlushWriter_0         0         0            689         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,974 StatusLogger.java:51 - ValidationExecutor                0         0              0         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,975 StatusLogger.java:51 - Sampler                           0         0              0         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,975 StatusLogger.java:51 - MemtableFlushWriter               0         0            689         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,976 StatusLogger.java:51 - InternalResponseStage             0         0            869         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,977 StatusLogger.java:51 - AntiEntropyStage                  0         0              0         0                 0

INFO  [ScheduledTasks:1] 2020-01-09 11:45:55,978 StatusLogger.java:51 - CacheCleanupExecutor              0         0              0         0                 0
INFO  [Service Thread] 2020-01-09 12:11:49,292 GCInspector.java:284 - ParNew GC in 659ms.  CMS Old Gen: 2056877512 -> 2057740336; Par Eden Space: 671088640 -> 0; Par Survivor Space: 2636992 -> 6187520
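
For reference, the two numbers in the "local pause" warnings appear to be nanoseconds, so 8206335128 > 5000000000 would mean the node itself stalled for about 8.2 s against a 5 s limit. A minimal Python sketch, assuming that unit and using a hypothetical helper name `local_pauses`, converts the logged values to seconds:

```python
import re

# Matches the FailureDetector warning. Both captured numbers are assumed
# to be nanoseconds (observed pause, then the configured limit).
PAUSE_RE = re.compile(r"local pause of (\d+) > (\d+)")


def local_pauses(log_lines):
    """Return a (pause_seconds, threshold_seconds) tuple per matching line."""
    pauses = []
    for line in log_lines:
        match = PAUSE_RE.search(line)
        if match:
            pause_ns, threshold_ns = (int(g) for g in match.groups())
            pauses.append((pause_ns / 1e9, threshold_ns / 1e9))
    return pauses


# The three warnings quoted in the log above.
sample = [
    "WARN  [GossipTasks:1] 2020-01-09 11:39:33,070 FailureDetector.java:278 - "
    "Not marking nodes down due to local pause of 8206335128 > 5000000000",
    "WARN  [GossipTasks:1] 2020-01-09 11:39:42,238 FailureDetector.java:278 - "
    "Not marking nodes down due to local pause of 6668041401 > 5000000000",
    "WARN  [GossipTasks:1] 2020-01-09 11:40:03,341 FailureDetector.java:278 - "
    "Not marking nodes down due to local pause of 15041441083 > 5000000000",
]

for pause_s, threshold_s in local_pauses(sample):
    print(f"paused {pause_s:.2f}s (limit {threshold_s:.0f}s)")
```

Comparing these pause lengths with the GCInspector lines in the same log is one way to check whether garbage collection alone accounts for the stalls.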

I tried the fixes suggested in a related question ("Cassandra Error message: Not marking nodes down due to local pause. Why?"), but none of them cover Kubernetes.

Output of nodetool tpstats:

Pool Name                         Active   Pending      Completed   Blocked  All time blocked
ReadStage                              0         0         245904         0                 0
MiscStage                              0         0              0         0                 0
CompactionExecutor                     0         0         696906         0                 0
MutationStage                          0         0         244820         0                 0
MemtableReclaimMemory                  0         0            697         0                 0
PendingRangeCalculator                 0         0              9         0                 0
GossipStage                            0         0        3138625         0                 0
SecondaryIndexManagement               0         0              0         0                 0
HintsDispatcher                        0         0             10         0                 0
RequestResponseStage                   0         0         364305         0                 0
Native-Transport-Requests              0         0       11089339         0               241
ReadRepairStage                        0         0           5395         0                 0
CounterMutationStage                   0         0              0         0                 0
MigrationStage                         0         0              6         0                 0
MemtablePostFlush                      0         0            725         0                 0
PerDiskMemtableFlushWriter_0           0         0            697         0                 0
ValidationExecutor                     0         0              0         0                 0
Sampler                                0         0              0         0                 0
MemtableFlushWriter                    0         0            697         0                 0
InternalResponseStage                  0         0            869         0                 0
ViewMutationStage                      0         0              0         0                 0
AntiEntropyStage                       0         0              0         0                 0
CacheCleanupExecutor                   0         0              0         0                 0

Message type           Dropped
READ                         0
RANGE_SLICE                  0
_TRACE                       0
HINT                         0
MUTATION                    45
COUNTER_MUTATION             0
BATCH_STORE                  0
BATCH_REMOVE                 0
REQUEST_RESPONSE             0
PAGED_RANGE                  0
READ_REPAIR                  0



Solution

  • The "tpstats" output above looks mostly fine, but 45 MUTATION messages were dropped, which indicates that your cluster is getting overloaded. Some requests were blocked too (Native-Transport-Requests shows 241 under "All time blocked"), and the commit log syncs are falling behind on writes. You should plan a cluster expansion or start debugging why the nodes are being overloaded.
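
As a starting point for that debugging, the check described above can be scripted. This is a minimal Python sketch, not an official tool: the function name `parse_tpstats` is hypothetical, and it assumes the column layout shown in this post (other Cassandra versions may print different columns). It lists the pools that have blocked tasks and the message types that were dropped.

```python
def parse_tpstats(text):
    """Return (blocked_pools, dropped_messages) from `nodetool tpstats` text.

    Assumes pool rows have six columns (name, active, pending, completed,
    blocked, all-time blocked) and dropped rows have two (type, count).
    """
    blocked, dropped = {}, {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 6 and all(p.isdigit() for p in parts[1:]):
            # Pool row: keep it if it is blocked now or was ever blocked.
            if int(parts[4]) or int(parts[5]):
                blocked[parts[0]] = int(parts[5])
        elif len(parts) == 2 and parts[1].isdigit():
            # Dropped-message row: keep only non-zero counts.
            if int(parts[1]):
                dropped[parts[0]] = int(parts[1])
    return blocked, dropped


# A few rows copied from the output in the question.
SAMPLE = """\
Pool Name                         Active   Pending      Completed   Blocked  All time blocked
MutationStage                          0         0         244820         0                 0
Native-Transport-Requests              0         0       11089339         0               241

Message type           Dropped
READ                         0
MUTATION                    45
"""

blocked, dropped = parse_tpstats(SAMPLE)
print("pools with blocked tasks:", blocked)
print("dropped messages:", dropped)
```

Running this periodically against each node and watching whether the counts keep growing shows whether the overload is ongoing or was a one-off spike.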