Tags: apache-spark, jmx, spark-streaming, visualvm

How to connect JMXConsole remotely to Spark streaming application


I have a Spark Streaming application running in yarn-cluster mode that reads from a Kafka topic.

I want to connect JMXConsole or Java VisualVM to these remote processes in a Cloudera distribution to gather some performance benchmarks.

How would I go about doing that?


Solution

  • The way I've done this is to add the following property (it also starts Flight Recorder):

    spark.executor.extraJavaOptions=-XX:+UnlockCommercialFeatures -XX:+FlightRecorder -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=0
    

    If you have only one worker running on each box, you can set a fixed port. If you have several, they would collide on the same fixed port, so you need to go with port 0 and then use lsof to find which port got assigned.
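
    Once lsof has given you a candidate port, it can help to confirm that it really is the JMX endpoint before pointing a GUI tool at it. The following is a minimal sketch, not part of the original answer: the class name JmxProbe is hypothetical, the host and port are whatever you found on the worker box, and it assumes authentication and SSL are disabled as in the options above. It attaches over the standard RMI-based JMX service URL and reads the executor's heap usage.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper class; the name is an illustration only.
public class JmxProbe {

    // Builds the RMI-based JMX service URL for a JVM started with
    // -Dcom.sun.management.jmxremote.port=<port>.
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java JmxProbe <host> <port>");
            return;
        }
        String host = args[0];                 // worker box running the executor
        int port = Integer.parseInt(args[1]);  // port reported by lsof

        JMXServiceURL url = new JMXServiceURL(serviceUrl(host, port));

        // No credentials are passed, matching authenticate=false and ssl=false.
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();

            // Proxy for the remote JVM's standard memory MXBean.
            MemoryMXBean memory = ManagementFactory.newPlatformMXBeanProxy(
                    connection, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);

            MemoryUsage heap = memory.getHeapMemoryUsage();
            System.out.printf("heap used=%d bytes, committed=%d bytes%n",
                    heap.getUsed(), heap.getCommitted());
        }
    }
}
```

    If the probe prints heap figures, the same host:port pair should work in JConsole's remote process field or in VisualVM's JMX connection dialog, provided the port is reachable from your machine.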