Can't add a new Cassandra datacenter due to streaming errors

Using DSE 4.8.6 (C*

When I try to add a new node in a new datacenter, the bootstrap / node rebuild is always interrupted by streaming errors.

Error example from system.log:

ERROR [STREAM-IN-/] 2016-04-19 12:30:28,531 - [Stream #743d44e0-060e-11e6-985c-c1820b05e9ae] Remote peer failed stream session.
INFO  [STREAM-IN-/] 2016-04-19 12:30:30,665 - [Stream #743d44e0-060e-11e6-985c-c1820b05e9ae] Session with / is complete
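
With several source nodes and long-running sessions, it can help to pull the failed stream session IDs out of system.log rather than reading it by eye. The following is only a rough sketch: the failure marker strings are copied from the log excerpts in this post, the regular expression assumes the "LEVEL [thread] timestamp ... [Stream #id] message" layout seen here, and the function name failed_sessions is made up for illustration.

```python
import re
from collections import defaultdict

# Assumed line layout, based on the excerpts in this post:
#   LEVEL [thread] YYYY-MM-DD HH:MM:SS,mmm ... [Stream #<uuid>] message
STREAM_LINE = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]*)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}).*?"
    r"\[Stream #(?P<sid>[0-9a-f-]{36})\]\s+(?P<msg>.*)$"
)

# Marker strings taken from the log lines quoted in this post.
FAILURE_MARKERS = (
    "Remote peer failed stream session",
    "Stream failed",
    "Streaming error occurred",
)


def failed_sessions(lines):
    """Map each stream session ID to its (timestamp, message) failure lines."""
    failures = defaultdict(list)
    for line in lines:
        match = STREAM_LINE.match(line.strip())
        if match and any(marker in match["msg"] for marker in FAILURE_MARKERS):
            failures[match["sid"]].append((match["ts"], match["msg"]))
    return dict(failures)
```

To use it, pass an open log file, e.g. failed_sessions(open(path)) where path points at your own system.log; stack trace lines and non-streaming lines are simply skipped.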

There is about 500 GB of data to stream to the new node. The bootstrap or rebuild operation streams it from 4 different nodes in the other (main) DC.

When a streaming error occurs, all synced data is wiped (and I have to start over).

What I tried so far:

  • bootstrapping the node
  • setting auto_bootstrap: false in cassandra.yaml and manually running nodetool rebuild
  • disabling streaming_socket_timeout_in_ms and setting more aggressive TCP keepalive values in my Linux configuration (following the advice in the CASSANDRA-9440 ticket)
  • increasing phi_convict_threshold (to the max)
  • skipping bootstrap and using repair to stream the data (I stopped the repair when the disk was nearly full and the node held 80K SSTables; after 3 days of trying to compact them, I gave up)

Is there anything else I should try? I'm in the process of running nodetool scrub on every failing node to see if this helps...

On the stream out node, these are the error messages:

ERROR [STREAM-IN-/] 2016-05-11 13:10:43,842 - [Stream #ecfe0390-1763-11e6-b6c8-c1820b05e9ae] Streaming error occurred null
        at$ ~[na:1.7.0_80]
        at ~[na:1.7.0_80]
        at java.nio.channels.Channels$ ~[na:1.7.0_80]
        at org.apache.cassandra.streaming.messages.StreamMessage.deserialize( ~[cassandra-all-]
        at org.apache.cassandra.streaming.ConnectionHandler$ ~[cassandra-all-]
        at [na:1.7.0_80]

and then:

INFO  [STREAM-IN-/] 2016-05-10 07:59:14,023 - [Stream #ea1271b0-1679-11e6-917a-c1820b05e9ae] Session with / is complete
WARN  [STREAM-IN-/] 2016-05-10 07:59:14,023 - [Stream #ea1271b0-1679-11e6-917a-c1820b05e9ae] Stream failed
ERROR [STREAM-OUT-/] 2016-05-10 07:59:14,024 - [Stream #ea1271b0-1679-11e6-917a-c1820b05e9ae] Streaming error occurred
java.lang.AssertionError: Memory was freed
        at ~[cassandra-all-]
        at ~[cassandra-all-]
        at ~[cassandra-all-]
        at org.apache.cassandra.streaming.messages.FileMessageHeader.size( ~[cassandra-all-]
        at org.apache.cassandra.streaming.StreamSession.fileSent( ~[cassandra-all-]
        at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize( ~[cassandra-all-]
        at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize( ~[cassandra-all-]
        at org.apache.cassandra.streaming.messages.StreamMessage.serialize( ~[cassandra-all-]
        at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage( ~[cassandra-all-]
        at org.apache.cassandra.streaming.ConnectionHandler$ ~[cassandra-all-]


  • As answered in the Cassandra ticket CASSANDRA-11345, this issue was due to a big SSTable file (40GB) being transferred.

    Transferring that file takes more than 1 hour, and by default a streaming operation times out if an outgoing transfer takes longer than 1 hour.

    To change this default, set streaming_socket_timeout_in_ms in the cassandra.yaml configuration file to a large value (e.g. 72000000 ms, which is 20 hours).
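
To sanity-check the value before putting it in cassandra.yaml, the arithmetic can be sketched as below. This is only an illustration: the 80 Mbit/s throughput and the safety factor of 2 are assumptions, not figures from this cluster, and the helper names are made up. Measure the effective streaming rate between your own datacenters and substitute it.

```python
import math


def hours_to_ms(hours: float) -> int:
    """Convert hours to milliseconds, the unit of streaming_socket_timeout_in_ms."""
    return int(hours * 60 * 60 * 1000)


def transfer_seconds(file_bytes: int, throughput_mbit_per_s: float) -> float:
    """Estimate how long one file takes to stream at a sustained rate."""
    bytes_per_second = throughput_mbit_per_s * 1_000_000 / 8
    return file_bytes / bytes_per_second


def suggested_timeout_ms(file_bytes: int,
                         throughput_mbit_per_s: float,
                         safety_factor: float = 2.0) -> int:
    """Timeout covering the largest SSTable, padded by a safety factor (assumed)."""
    seconds = transfer_seconds(file_bytes, throughput_mbit_per_s) * safety_factor
    return math.ceil(seconds * 1000)


if __name__ == "__main__":
    largest_sstable = 40 * 10**9   # the 40 GB file from the ticket
    assumed_rate = 80              # Mbit/s, hypothetical effective rate
    print(hours_to_ms(20))         # value suggested above, in ms
    print(transfer_seconds(largest_sstable, assumed_rate) / 3600)  # hours
    print(suggested_timeout_ms(largest_sstable, assumed_rate))
```

At the assumed rate, the 40 GB file needs a little over an hour, which is consistent with it hitting a 1 hour timeout. Whatever value you compute, size it against the largest SSTable on the source nodes, not the average.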