
HdfsRpcException: Failed to invoke RPC call "getFsStats" on server


I've installed a single-node Hadoop cluster on an EC2 instance. I then stored some test data on HDFS, and I'm trying to load that data into SAP Vora. I'm using SAP Vora 2.0 for this project.

To create the table and load the data to Vora, this is the query I'm running:

drop table if exists dims;
CREATE TABLE dims(teamid int, team string)
USING com.sap.spark.engines.relational 
OPTIONS (
hdfsnamenode "namenode.example.com:50070",
files "/path/to/file.csv",
storagebackend "hdfs");

When I run the above query, I get this error message:

com.sap.vora.jdbc.VoraException: HL(9): Runtime error.
  (could not handle api call, failure reason : execution of scheduler plan failed:
    found error: :-1, CException, Code: 10021 : Runtime category : an std::exception wrapped.
    Next level: v2 HDFS Plugin: Exception at opening
    hdfs://namenode.example.com:50070/path/to/file.csv:
    HdfsRpcException: Failed to invoke RPC call "getFsStats" on server
    "namenode.example.com:50070" for node id 20
    with error code 0, status ERROR_STATUS

Hadoop and Vora are running on different nodes.


Solution

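  • You should specify the HDFS NameNode RPC port, which is typically 8020 (some installations use 9000). 50070 is the port of the NameNode web UI, so RPC calls such as "getFsStats" fail against it. See e.g. the related question "Default Namenode port of HDFS is 50070. But I have come across at some places 8020 or 9000".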
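
As a sketch, here is the query from the question with only the port changed. It assumes the NameNode accepts RPC connections on 8020; the actual value is whatever fs.defaultFS is set to in core-site.xml on the Hadoop node (running `hdfs getconf -confKey fs.defaultFS` there prints it). Since Hadoop and Vora run on different nodes, that port presumably also has to be reachable from the Vora node, for example through the EC2 security group.

```sql
-- Same statement as in the question; only the port differs.
-- 8020 is an assumption: use the port from fs.defaultFS on your cluster.
drop table if exists dims;
CREATE TABLE dims(teamid int, team string)
USING com.sap.spark.engines.relational
OPTIONS (
hdfsnamenode "namenode.example.com:8020",
files "/path/to/file.csv",
storagebackend "hdfs");
```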