Tags: mysql, cassandra, cqlsh

Cassandra ERROR: LEAK DETECTED


I have a two-node Cassandra cluster (version 3.10). I tried to read data from MySQL and write it into Cassandra. Everything was OK, but at some point I got this error:

Exception in thread "main" com.datastax.driver.core.exceptions.TransportException: [/192.168.22.231:9042] Connection has been closed
        at com.datastax.driver.core.exceptions.TransportException.copy(TransportException.java:38)
        at com.datastax.driver.core.exceptions.TransportException.copy(TransportException.java:24)
        at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
        at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245)
        at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:68)
        at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:43)
        at TableCreator.insertDataToCassandra(TableCreator.java:1037)
        at TableCreator.createTable(TableCreator.java:356)
        at DbMigration.main(DbMigration.java:25)
Caused by: com.datastax.driver.core.exceptions.TransportException: [/192.168.22.231:9042] Connection has been closed
        at com.datastax.driver.core.Connection$ConnectionCloseFuture.force(Connection.java:1215)
        at com.datastax.driver.core.Connection$ConnectionCloseFuture.force(Connection.java:1200)
        at com.datastax.driver.core.Connection.defunct(Connection.java:450)

Nodetool status shows both nodes are UP and healthy:

Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Tokens       Owns (effective)  Host ID                               Rack
UN  192.168.22.229  77.39 GiB  256          100.0%            335fc3a2-c21f-44ad-a937-487ba457c2fa  rack1
UN  192.168.22.231  77.39 GiB  256          100.0%            a5eaf96c-eaf9-4e2e-bd6b-6186ce944cd0  rack1

Now I am not able to connect to my first node:

cqlsh --connect-timeout=30 192.168.22.231
Connection error: ('Unable to connect to any servers', {'192.168.22.231': error(111, "Tried connecting to [('192.168.22.231', 9042)]. Last error: Connection refused")})

But I can connect to the second node. I checked system.log and debug.log.

system.log:

INFO [CompactionExecutor:4003] 2017-06-19 07:25:19,676 NoSpamLogger.java:91 - Maximum memory usage reached (1.000GiB), cannot allocate chunk of 1.000MiB
INFO [IndexSummaryManager:1] 2017-06-19 07:28:02,222 IndexSummaryRedistribution.java:75 - Redistributing index summaries
INFO [CompactionExecutor:4009] 2017-06-19 07:40:42,023 NoSpamLogger.java:91 - Maximum memory usage reached (1.000GiB), cannot allocate chunk of 1.000MiB
INFO [CompactionExecutor:4015] 2017-06-19 07:56:04,582 NoSpamLogger.java:91 - Maximum memory usage reached (1.000GiB), cannot allocate chunk of 1.000MiB
INFO [CompactionExecutor:4021] 2017-06-19 08:11:26,674 NoSpamLogger.java:91 - Maximum memory usage reached (1.000GiB), cannot allocate chunk of 1.000MiB
INFO [Service Thread] 2017-06-19 08:21:48,726 GCInspector.java:284 - ConcurrentMarkSweep GC in 225ms. CMS Old Gen: 5813642680 -> 3194404296; Par Eden Space: 10360904 -> 347594616; Par Survivor Space: 83886080 -> 42514752

INFO [CompactionExecutor:4027] 2017-06-19 08:26:49,414 NoSpamLogger.java:91 - Maximum memory usage reached (1.000GiB), cannot allocate chunk of 1.000MiB
INFO [IndexSummaryManager:1] 2017-06-19 08:28:02,341 IndexSummaryRedistribution.java:75 - Redistributing index summaries
INFO [CompactionExecutor:4031] 2017-06-19 08:42:12,733 NoSpamLogger.java:91 - Maximum memory usage reached (1.000GiB), cannot allocate chunk of 1.000MiB
INFO [Service Thread] 2017-06-19 08:52:33,145 GCInspector.java:284 - ConcurrentMarkSweep GC in 215ms. CMS Old Gen: 5868761104 -> 3186639968; Par Eden Space: 9853592 -> 423279080; Par Survivor Space: 83886080 -> 45608368

INFO [CompactionExecutor:4037] 2017-06-19 08:57:34,632 NoSpamLogger.java:91 - Maximum memory usage reached (1.000GiB), cannot allocate chunk of 1.000MiB
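The recurring NoSpamLogger entry shows the 1 GiB chunk-cache limit being hit over and over. To gauge how often this happens across a whole log file, a small sketch that pulls the timestamp out of each occurrence (the log format is taken from the excerpt above; this is a diagnostic aid, not part of the fix):

```python
import re

# Matches the NoSpamLogger entry Cassandra emits when the chunk cache is
# exhausted, capturing the timestamp of each occurrence.
PATTERN = re.compile(
    r"INFO\s+\[[^\]]+\]\s+(\d{4}-\d{2}-\d{2} [\d:,]+)\s+NoSpamLogger\.java:\d+"
    r" - Maximum memory usage reached"
)

def cache_exhaustion_events(log_text: str) -> list[str]:
    """Return timestamps of every 'Maximum memory usage reached' entry."""
    return PATTERN.findall(log_text)

sample = (
    "INFO [CompactionExecutor:4003] 2017-06-19 07:25:19,676 NoSpamLogger.java:91 "
    "- Maximum memory usage reached (1.000GiB), cannot allocate chunk of 1.000MiB\n"
    "INFO [IndexSummaryManager:1] 2017-06-19 07:28:02,222 "
    "IndexSummaryRedistribution.java:75 - Redistributing index summaries\n"
)
print(cache_exhaustion_events(sample))  # ['2017-06-19 07:25:19,676']
```

If the events cluster just before the connection failures, that supports memory pressure as the cause.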

debug.log:

DEBUG [SlabPoolCleaner] 2017-06-19 09:12:59,260 ColumnFamilyStore.java:899 - Enqueuing flush of size_estimates: 78.229MiB (4%) on-heap, 0.000KiB (0%) off-heap
DEBUG [PerDiskMemtableFlushWriter_0:6285] 2017-06-19 09:12:59,575 Memtable.java:461 - Writing Memtable-size_estimates@1285230058(21.550MiB serialized bytes, 150876 ops, 4%/0% of on/off-heap limit), flushed range = (min(-9223372036854775808), max(9223372036854775807)]

DEBUG [MemtableFlushWriter:6230] 2017-06-19 09:12:59,618 ColumnFamilyStore.java:1197 - Flushed to [BigTableReader(path='/var/lib/cassandra/data/system/size_estimates-618f817b005f3678b8a453f3930b8e86/mc-21460-big-Data.db')] (1 sstables, 2.340MiB), biggest 2.340MiB, smallest 2.340MiB

DEBUG [PerDiskMemtableFlushWriter_0:6285] 2017-06-19 09:12:59,895 Memtable.java:490 - Completed flushing /var/lib/cassandra/data/system/size_estimates-618f817b005f3678b8a453f3930b8e86/mc-21461-big-Data.db (15.813MiB) for commitlog position CommitLogPosition(segmentId=1496885272130, position=32411980)

DEBUG [MemtableFlushWriter:6231] 2017-06-19 09:13:00,077 ColumnFamilyStore.java:1197 - Flushed to [BigTableReader(path='/var/lib/cassandra/data/system/size_estimates-618f817b005f3678b8a453f3930b8e86/mc-21461-big-Data.db')] (1 sstables, 2.340MiB), biggest 2.340MiB, smallest 2.340MiB

DEBUG [CompactionExecutor:4043] 2017-06-19 09:13:02,440 CompactionTask.java:255 - Compacted (0b5956b0-5484-11e7-b5a8-01c062b805b9) 4 sstables to [/var/lib/cassandra/data/system/size_estimates-618f817b005f3678b8a453f3930b8e86/mc-21458-big,] to level=0. 20.780MiB to 13.916MiB (~66% of original) in 6,349ms. Read Throughput = 3.273MiB/s, Write Throughput = 2.192MiB/s, Row Throughput = ~131,062/s. 12 total partitions merged to 5. Partition merge counts were {2:4, 4:1, }

DEBUG [CompactionExecutor:4043] 2017-06-19 09:13:02,440 CompactionTask.java:155 - Compacting (0f221e81-5484-11e7-b5a8-01c062b805b9) [/var/lib/cassandra/data/system/size_estimates-618f817b005f3678b8a453f3930b8e86/mc-21461-big-Data.db:level=0, /var/lib/cassandra/data/system/size_estimates-618f817b005f3678b8a453f3930b8e86/mc-21460-big-Data.db:level=0, /var/lib/cassandra/data/system/size_estimates-618f817b005f3678b8a453f3930b8e86/mc-21459-big-Data.db:level=0, /var/lib/cassandra/data/system/size_estimates-618f817b005f3678b8a453f3930b8e86/mc-21458-big-Data.db:level=0, ]

DEBUG [CompactionExecutor:4043] 2017-06-19 09:13:08,453 CompactionTask.java:255 - Compacted (0f221e81-5484-11e7-b5a8-01c062b805b9) 4 sstables to [/var/lib/cassandra/data/system/size_estimates-618f817b005f3678b8a453f3930b8e86/mc-21462-big,] to level=0. 20.775MiB to 13.923MiB (~67% of original) in 6,012ms. Read Throughput = 3.455MiB/s, Write Throughput = 2.316MiB/s, Row Throughput = ~131,059/s. 8 total partitions merged to 5. Partition merge counts were {1:4, 4:1, }

Any suggestions would be helpful and appreciated.


Solution

  • You ran out of memory (see the first line of your system.log: "Maximum memory usage reached"). Strangely enough, Cassandra logs this at INFO rather than WARN level. When a node is out of memory it can still report UN in nodetool status, yet be unable to accept new client connections, which is consistent with the "Connection refused" you get from cqlsh. Add more memory until the node has enough to run stably.
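If you do add RAM, the two knobs usually involved are the JVM heap and the chunk cache that produced the "Maximum memory usage reached (1.000GiB)" message. A hedged sketch of the relevant settings in a stock Cassandra 3.x install (the values below are illustrative assumptions, not recommendations; size them against your actual hardware):

```
# cassandra.yaml: the buffer pool behind "Maximum memory usage reached".
# Raising it only helps if the machine actually has spare RAM.
file_cache_size_in_mb: 2048

# jvm.options: fixed heap size; keep -Xms and -Xmx equal to avoid resizing.
-Xms8G
-Xmx8G
```

Restart the node after changing either file and watch system.log to confirm the NoSpamLogger message stops recurring.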