amazon-ec2, cassandra, datastax-enterprise, spark-cassandra-connector, beeline

Spark-Cassandra thrift server on EC2 throws SparkException on query from beeline


I installed a Cassandra/Spark/Hadoop cluster on 3 EC2 nodes. Yesterday I was able to start the Spark thrift server on node0 and execute a simple SQL statement in beeline. Today, after a schema change, I restarted the thrift server, and now I get a SparkException: java.lang.IllegalArgumentException: ip-172-30-4-140 at org.apache.hadoop.hive.cassandra.cql3.input.HiveCqlInputFormat.getRecordReader(HiveCqlInputFormat.java:212)

ip-172-30-4-140 is simply the internal hostname of that node, derived from its private IP.

I tried running the same sequence from the other two Cassandra nodes; on those, the SQL statement hangs and never returns.

What does this error mean? Does anyone know?


Solution

  • OK, I found the problem.

    The host parameter defaults to the EC2 instance's internal DNS name, which causes the exception. The host needs to be declared explicitly when starting the thrift server:

    sudo dse spark-sql-thriftserver start hive.server2.thrift.bind.host=your-ec2-private-ip
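
As a rough sketch of the fix above, the start command can be built from the node's private IP rather than typed by hand. The helper name `thrift_start_cmd` is hypothetical. The `--hiveconf` flag is an assumption: Spark's own thrift server script takes Hive properties that way, whereas the command above passes the property bare, so check which form your DSE version expects. The address 172.30.4.140 is the one behind the hostname in the error message.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the thrift server start command
# for the given bind address, so it can be reviewed before execution.
# Assumption: the property is passed with --hiveconf, as Spark's thrift
# server script expects; drop the flag if your DSE version takes it bare.
thrift_start_cmd() {
    bind_host="$1"
    printf 'sudo dse spark-sql-thriftserver start --hiveconf hive.server2.thrift.bind.host=%s\n' "$bind_host"
}

# Example: the private IP behind the hostname ip-172-30-4-140.
thrift_start_cmd 172.30.4.140
```

Printing the command instead of running it keeps the sketch safe to try on any machine; once the output looks right, it can be pasted into a shell on the node that hosts the thrift server.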