
R is not connecting to HDFS


Why is R not connecting to Hadoop?

I am trying to connect R to HDFS using the 'rhdfs' package. The 'rJava' package is installed and the 'rhdfs' package is loaded.

The HADOOP_CMD environment variable is set in R using:

Sys.setenv(HADOOP_CMD='/usr/local/hadoop/bin')

But when hdfs.init() is called, the following error message is produced:

sh: 1: /usr/local/hadoop/bin: Permission denied
Error in .jnew("org/apache/hadoop/conf/Configuration") : 
java.lang.ClassNotFoundException
In addition: Warning message:
running command '/usr/local/hadoop/bin classpath' had status 126 

I also loaded the 'rmr2' library and ran the following code:

ints = to.dfs(1:100)

which generated the message given below:

sh: 1: /usr/local/hadoop/bin: Permission denied

The R-Hadoop packages are accessible only to the 'root' user and not to 'hduser' (the Hadoop user), because they were installed while R was running as 'root'.


Solution

  • Try this:

    Sys.setenv(HADOOP_CMD='/usr/local/hadoop/bin/hadoop')
    
    Sys.setenv(JAVA_HOME='/usr/lib/jvm/java-6-openjdk-amd64')
    
    library(rhdfs)
    
    hdfs.init()
    
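    HADOOP_CMD must be the full path to the hadoop executable, so it has to end with /bin/hadoop. In the question it points to the bin directory, which the shell cannot execute; that is why you get "Permission denied" and exit status 126.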
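
Before calling hdfs.init(), it can help to confirm that HADOOP_CMD really names an executable file and not a directory. The following is a minimal sketch using only base R; the helper name is_usable_hadoop_cmd is hypothetical and is not part of rhdfs or rmr2.

```r
# Hypothetical helper (not part of rhdfs): returns TRUE only if `path`
# is an existing, executable file and not a directory.
is_usable_hadoop_cmd <- function(path) {
  if (!nzchar(path) || !file.exists(path)) return(FALSE)
  # A directory such as /usr/local/hadoop/bin passes the execute-permission
  # test below, so it has to be rejected explicitly.
  if (isTRUE(file.info(path)$isdir)) return(FALSE)
  # file.access() returns 0 when the requested access (mode 1 = execute) is allowed.
  unname(file.access(path, mode = 1) == 0)
}

hadoop_cmd <- Sys.getenv("HADOOP_CMD")
if (!is_usable_hadoop_cmd(hadoop_cmd)) {
  message("HADOOP_CMD does not point to an executable file: '", hadoop_cmd, "'")
}
```

With HADOOP_CMD set to '/usr/local/hadoop/bin' as in the question, this check fails because the path is a directory. It should pass once the variable points at the hadoop binary itself, assuming the current user is allowed to execute it.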