apache-spark hadoop-yarn

What is the difference between yarn mode and deploy mode in Spark?


I'm very confused right now. Please check if this is right.

Here are the 4 command variants I'm asking about:

# This means YARN is in cluster mode and the deploy mode is cluster.
# The cluster has a YARN container (holding the Spark AM and Spark driver) and a YARN NodeManager.
spark-submit --master yarn --deploy-mode cluster

# This means YARN is in cluster mode and the deploy mode is client.
# The client has the Spark driver.
# The cluster has a YARN container (holding the Spark AM and Spark driver) and a YARN NodeManager.
spark-submit --master yarn --deploy-mode client

# This means YARN is in client mode and the deploy mode is cluster.
# The cluster has a YARN container (holding the Spark AM) and a YARN NodeManager.
spark-submit --master yarn-client --deploy-mode cluster

# This means YARN is in client mode and the deploy mode is client.
# The client has the Spark driver.
# The cluster has a YARN container (holding the Spark AM) and a YARN NodeManager.
spark-submit --master yarn-client --deploy-mode client

Is the explanation of the above code correct?


Solution

  • # Use YARN as the cluster manager and run the driver inside the YARN cluster.

    spark-submit --master yarn --deploy-mode cluster
    

    # Use YARN as the cluster manager and run the driver on the local machine (the machine that launches the job).

    spark-submit --master yarn --deploy-mode client # client is the default if you don't specify --deploy-mode
    

    These are no longer valid options, so they aren't really worth discussing:

    spark-submit --master yarn-client --deploy-mode cluster
    spark-submit --master yarn-client --deploy-mode client 
    

    --master yarn-client was a valid option in early versions of Spark (1.x), but it has been deprecated since Spark 2.0; use --master yarn together with --deploy-mode instead (as referenced in the Spark documentation).
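
To make the mapping concrete, here is a minimal shell sketch. The two functions, modern_yarn_args and driver_location, are hypothetical helpers written for illustration and are not part of Spark. The first rewrites a legacy master value into the --master yarn plus --deploy-mode form, and the second reports where the driver runs for a given deploy mode, following the Spark "Running on YARN" description (client: driver in the submitting process; cluster: driver inside the YARN ApplicationMaster).

```shell
#!/bin/sh

# Hypothetical helper (not part of Spark): translate a master value,
# including the legacy yarn-client / yarn-cluster spellings, into the
# "--master yarn --deploy-mode <mode>" arguments.
# $1 = master value, $2 = optional deploy mode (defaults to client).
modern_yarn_args() {
  case "$1" in
    yarn-client)  echo "--master yarn --deploy-mode client" ;;
    yarn-cluster) echo "--master yarn --deploy-mode cluster" ;;
    yarn)         echo "--master yarn --deploy-mode ${2:-client}" ;;
    *)            echo "unsupported master: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: report where the Spark driver runs on YARN.
# $1 = deploy mode (defaults to client, like spark-submit).
driver_location() {
  case "${1:-client}" in
    client)  echo "submitting machine" ;;
    cluster) echo "YARN ApplicationMaster container" ;;
    *)       echo "unknown deploy mode: $1" >&2; return 1 ;;
  esac
}

modern_yarn_args yarn-client    # prints: --master yarn --deploy-mode client
modern_yarn_args yarn cluster   # prints: --master yarn --deploy-mode cluster
driver_location cluster         # prints: YARN ApplicationMaster container
```

The point of the sketch is that the master value only selects the cluster manager, while the deploy mode alone decides where the driver lives. In both modes the executors still run in YARN containers.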