
Unable to connect to Cassandra through R


I am trying to follow the example at "http://www.datastax.com/dev/blog/big-analytics-with-r-cassandra-and-hive" to connect R to Cassandra. Here is my code:

    library(RJDBC)

    # Load the Cassandra JDBC driver
    cassdrv <- JDBC("org.apache.cassandra.cql.jdbc.CassandraDriver",
                    list.files("D:/cassandra/lib", pattern = "jar$", full.names = TRUE))

    # Connect to the Cassandra node and keyspace
    casscon <- dbConnect(cassdrv, "jdbc:cassandra://127.0.0.1:9042/demodb")

When I run the above code in R, I get the following error:

Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1],  : 
  java.sql.SQLNonTransientConnectionException: org.apache.thrift.transport.TTransportException: Read a negative frame size (-2113929216)!
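
One possible reading of the negative frame size, offered as an interpretation rather than a confirmed diagnosis: the Thrift client treats the first four bytes of the server's reply as a signed 32-bit frame length. The sketch below splits the reported value back into those four bytes. It assumes the server answered with native protocol v2 framing, where a response header starts with the version byte 0x82 and opcode 0x00 means ERROR.

```r
# Value reported in the TTransportException
frame_size <- -2113929216

# Reinterpret the signed 32-bit value as unsigned
unsigned <- frame_size + 2^32

# Split into four big-endian bytes
bytes <- (unsigned %/% 256^(3:0)) %% 256
hex_bytes <- sprintf("%02X", as.integer(bytes))

print(hex_bytes)
```

The bytes come out as 82 00 00 00, which matches the shape of a native protocol error response and would be consistent with a Thrift-based driver talking to the native transport port.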

In the Cassandra server window, I get the following error for the above code:

ERROR 14:41:26,671 Unexpected exception during request
java.lang.ArrayIndexOutOfBoundsException: 34
        at org.apache.cassandra.transport.Message$Type.fromOpcode(Message.java:106)
        at org.apache.cassandra.transport.Frame$Decoder.decode(Frame.java:168)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
        at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
        at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
        at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
        at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
        at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)

I tried changing the port from 9042 to 9160, but in that case the request does not reach the server at all. I also tried increasing thrift_framed_transport_size_in_mb from 15 to 500, but the error is the same.
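
A quick way to see whether anything is listening on a given port before involving JDBC is a plain TCP probe from R. This is a minimal sketch using base R's socketConnection; the helper name port_open is hypothetical, and the host and ports are taken from the code above. In cassandra.yaml, 9160 is normally the rpc_port (Thrift) and 9042 the native_transport_port, and the Thrift server only listens when start_rpc is enabled.

```r
# Hypothetical helper: returns TRUE if a TCP connection to host:port succeeds
port_open <- function(host, port, timeout = 2) {
  con <- tryCatch(
    suppressWarnings(
      socketConnection(host, port, open = "r+", blocking = TRUE, timeout = timeout)
    ),
    error = function(e) NULL  # connection refused or timed out
  )
  if (is.null(con)) {
    return(FALSE)
  }
  close(con)
  TRUE
}

# Probe the Thrift and native transport ports on the local node
print(port_open("127.0.0.1", 9160))
print(port_open("127.0.0.1", 9042))
```

If the probe for 9160 returns FALSE while 9042 returns TRUE, the Thrift server is probably not running or is bound to a different address.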

Cassandra is otherwise running fine, and I can connect to and update the database easily through DevCenter.

R version: R-3.1.0,
Cassandra version: 2.0.8,
Operating System: Windows XP,
Firewall: off

Solution

  • I was finally able to connect to Cassandra through R. I followed these steps:

    1. I updated Java 7 and R to their latest versions.
    2. I reinstalled RJDBC, rJava, and DBI.
    3. I used the following code and connected successfully:

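      library(RJDBC)

      # Load the Cassandra JDBC driver with every jar in the Cassandra lib directory
      drv <- JDBC("org.apache.cassandra.cql.jdbc.CassandraDriver",
                  list.files("D:/cassandra/lib/", pattern = "jar$", full.names = TRUE))

      # Add the Cassandra client utility jar to the Java classpath
      .jaddClassPath("D:/mysql-connector-java-3.1.14/cassandra-clientutil-1.0.2.jar")

      # Connect on port 9160 (the Thrift rpc_port), not 9042 (the native transport port)
      conn <- dbConnect(drv, "jdbc:cassandra://127.0.0.1:9160/demodb")

      # Query the emp table
      res <- dbGetQuery(conn, "select * from emp")

      # Print values
      res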
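
The connection steps above could be wrapped in small helpers so that a missing jar directory or a wrong port shows up early. This is only a sketch: the function names cassandra_jdbc_url, find_driver_jars, and connect_cassandra are hypothetical, the defaults mirror the values in the working code, and connect_cassandra needs RJDBC installed and a running node.

```r
# Build the JDBC URL; 9160 is the port used in the working example
cassandra_jdbc_url <- function(host = "127.0.0.1", port = 9160L, keyspace = "demodb") {
  sprintf("jdbc:cassandra://%s:%d/%s", host, as.integer(port), keyspace)
}

# Collect the driver jars and fail early if the directory has none
find_driver_jars <- function(lib_dir) {
  jars <- list.files(lib_dir, pattern = "\\.jar$", full.names = TRUE)
  if (length(jars) == 0) {
    stop("No jar files found in: ", lib_dir)
  }
  jars
}

# Hypothetical wrapper: extra arguments are passed on to cassandra_jdbc_url()
connect_cassandra <- function(lib_dir, ...) {
  drv <- RJDBC::JDBC("org.apache.cassandra.cql.jdbc.CassandraDriver",
                     find_driver_jars(lib_dir))
  DBI::dbConnect(drv, cassandra_jdbc_url(...))
}

print(cassandra_jdbc_url())
```

A call such as connect_cassandra("D:/cassandra/lib/", keyspace = "demodb") would then return a connection usable with dbGetQuery, and dbDisconnect(conn) closes it when done.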