apache-spark, hadoop-yarn, cloudera, emr

Livy Server on Amazon EMR hangs on Connecting to ResourceManager


I'm trying to deploy a Livy Server on Amazon EMR. First, I built the Livy master branch:

mvn clean package -Pscala-2.11 -Pspark-2.0

Then I uploaded it to the EMR cluster's master node and set the following configurations:

livy-env.sh

SPARK_HOME=/usr/lib/spark
HADOOP_CONF_DIR=/etc/hadoop/conf

livy.conf

livy.spark.master = yarn
livy.spark.deployMode = cluster
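
For reference, I start the server with the launcher script that ships in the Livy distribution (bin/livy-server); the unpack location /home/hadoop/livy below is just where I happened to put it:

# run in the foreground so the startup log is visible
# (unpack location is an assumption, adjust to wherever the zip was extracted)
cd /home/hadoop/livy
./bin/livy-server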

When I start Livy, it hangs indefinitely while connecting to the YARN ResourceManager (XX.XX.XXX.XX is the IP address):

16/10/28 17:56:23 INFO RMProxy: Connecting to ResourceManager at /XX.XX.XXX.XX:8032

However, when I netcat port 8032, it connects successfully:

nc -zv XX.XX.XXX.XX 8032
Connection to XX.XX.XXX.XX 8032 port [tcp/pro-ed] succeeded!
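
To rule out a wrong address on Livy's side, I also checked which ResourceManager address the Hadoop client config points at, and that YARN answers a real client call rather than just the TCP handshake (the config path below assumes the standard EMR layout):

# ResourceManager address as seen by Hadoop clients
grep -A 1 'yarn.resourcemanager.address' /etc/hadoop/conf/yarn-site.xml

# issue an actual YARN client RPC, not just a socket connect
yarn node -list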

I think I'm probably missing some step. Does anyone have an idea of what that step might be?


Solution

  • I made the following changes to the config files after unzipping the livy-server-0.2.0.zip file

    livy-env.sh

    export SPARK_HOME=/usr/hdp/current/spark-client
    export HADOOP_HOME=/usr/hdp/current/hadoop-client/bin/
    export HADOOP_CONF_DIR=/etc/hadoop/conf
    export SPARK_CONF_DIR=$SPARK_HOME/conf
    export LIVY_LOG_DIR=/jobserver-livy/logs
    export LIVY_PID_DIR=/jobserver-livy
    export LIVY_MAX_LOG_FILES=10
    export HBASE_HOME=/usr/hdp/current/hbase-client/bin
    

    livy.conf

    livy.rsc.rpc.server.address=<Loop Back address>
    

    Add 'spark.master yarn-cluster' to the 'spark-defaults.conf' file, which is under the Spark conf folder, and restart Livy. A quick way to verify the fix is shown below.

    Please let me know if you still have issues.
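
    A smoke test against Livy's REST API confirms that sessions really reach YARN (8998 is Livy's default port, and session id 0 assumes it is the first session created; adjust both if needed):

    # create a Spark session through Livy
    curl -s -X POST -H 'Content-Type: application/json' \
         -d '{"kind": "spark"}' \
         http://localhost:8998/sessions

    # the session state should move from "starting" to "idle"
    curl -s http://localhost:8998/sessions/0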