Tags: cassandra, datastax-enterprise, cassandra-3.0, datastax-java-driver, spark-cassandra-connector

DataStax driver connection exception (DSE 5.0, Cassandra 3.0.7, Spark)


I am trying to understand a warning. Every time I run my Spark job I see the exception below, on 2 nodes of my 3-node cluster. As I said, it is only a warning, and the job still succeeds.

com.datastax.driver.core.exceptions.ConnectionException: [x.x.x.x/x.x.x.x:9042] Pool was closed during initialization

CASSANDRA LOG

INFO [SharedPool-Worker-1] 2017-07-17 22:25:48,716 Message.java:605 - Unexpected exception during request; channel = [id: 0xf0ee1096, /x.x.x.x:54863 => /x.x.x.x:9042]
io.netty.channel.unix.Errors$NativeIoException: readAddress() failed: Connection timed out
    at io.netty.channel.unix.Errors.newIOException(Errors.java:105) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    at io.netty.channel.unix.Errors.ioResult(Errors.java:121) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    at io.netty.channel.unix.FileDescriptor.readAddress(FileDescriptor.java:134) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    at io.netty.channel.epoll.AbstractEpollChannel.doReadBytes(AbstractEpollChannel.java:239) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:822) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:348) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) ~[netty-all-4.0.34.Final.jar:4.0.34.Final]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]


Solution

  • The core of the error is "Connection timed out". I recommend troubleshooting network connectivity to the Cassandra cluster, starting with simpler tools such as ping, telnet and nc. Some potential causes:

    • The Cassandra client's connection configuration included an address that is not valid (not a node in the Cassandra cluster).
    • A network misconfiguration or firewall rule is preventing connections from the client to the Cassandra server.
    • The destination Cassandra server is overloaded, such that it cannot respond to new connection requests.

    You mentioned that the problem affects only some nodes ("seeing this in 2 nodes of my 3 node cluster") and does not cause job failure. This suggests that one of the problems listed above is happening for just a subset of nodes in the cluster. (If connectivity to all nodes were broken, the job would likely have failed.)
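
If you would rather script the nc-style check than run it by hand on each machine, something like the following might help. It is only a minimal sketch, assuming Python 3 is available on the hosts that run your Spark executors. The function names `can_connect` and `check_nodes` are made up for this illustration, and the node addresses are whatever you have configured as contact points. It tests only whether a TCP connection to the native transport port (9042, as in your log) can be opened within a timeout; it does not speak the CQL protocol.

```python
import socket


def can_connect(host, port=9042, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host and applies the timeout to the connect call.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, unreachable hosts and DNS failures.
        return False


def check_nodes(hosts, port=9042, timeout=3.0):
    """Check each host in turn and return a dict mapping host -> reachable (bool)."""
    return {host: can_connect(host, port, timeout) for host in hosts}


# Example usage (replace the placeholders with your own node addresses):
# for host, ok in check_nodes(["<node-1>", "<node-2>", "<node-3>"]).items():
#     print(host, "reachable" if ok else "NOT reachable")
```

Running this from each Spark worker, rather than from a single machine, should show whether the timeouts are specific to particular client/server pairs. A node that reports as not reachable from some workers but reachable from others points more toward a firewall or routing issue than toward an overloaded server.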