apache-spark, emr, apache-zeppelin

What is CPS in SPARK_SUBMIT_OPTIONS?


On AWS EMR, /etc/zeppelin/conf/zeppelin-env.sh contains this:

export SPARK_SUBMIT_OPTIONS="$SPARK_SUBMIT_OPTIONS \
--conf 'spark.executorEnv.PYTHONPATH=/usr/lib/spark/python/lib/py4j-src.zip:/usr/lib/spark/python/:<CPS>{{PWD}}/pyspark.zip<CPS>{{PWD}}/py4j-src.zip' \
--conf spark.yarn.isPython=true"

What is the <CPS> in spark.executorEnv.PYTHONPATH?


Solution

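  • <CPS> stands for "classpath separator". It is a placeholder that the YARN NodeManager replaces with the platform's real path separator (':' on Linux, ';' on Windows) when it launches the container. The {{PWD}} next to it is a similar placeholder, expanded to $PWD on Linux or %PWD% on Windows.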

    See https://issues.apache.org/jira/browse/YARN-6554 for a reference.

    It's a little odd that this setting mixes <CPS> with a literal ':'. To be platform independent, it should probably use <CPS> in place of every ':'. However, since EMR only runs on Amazon Linux, the setting does not need to be platform independent.
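
To make the substitution concrete, here is a minimal Python sketch of roughly what happens to the value at container launch. This is not YARN's actual code (that lives in the NodeManager, in Java); the function name expand_yarn_markers and its parameters are made up for illustration.

```python
import os


def expand_yarn_markers(value: str, pathsep: str = os.pathsep, windows: bool = False) -> str:
    """Illustrative stand-in for the marker expansion YARN performs at launch.

    Hypothetical helper, not part of Spark, Hadoop, or Zeppelin.
    """
    # <CPS> becomes the platform's path separator (':' or ';').
    value = value.replace("<CPS>", pathsep)
    # {{VAR}} becomes %VAR% on Windows and $VAR elsewhere.
    if windows:
        value = value.replace("{{", "%").replace("}}", "%")
    else:
        value = value.replace("{{", "$").replace("}}", "")
    return value


# The PYTHONPATH value from zeppelin-env.sh in the question.
raw = (
    "/usr/lib/spark/python/lib/py4j-src.zip:/usr/lib/spark/python/:"
    "<CPS>{{PWD}}/pyspark.zip<CPS>{{PWD}}/py4j-src.zip"
)

expanded = expand_yarn_markers(raw, pathsep=":")
print(expanded)
# /usr/lib/spark/python/lib/py4j-src.zip:/usr/lib/spark/python/::$PWD/pyspark.zip:$PWD/py4j-src.zip
```

Because the original value already has a literal ':' right before the first <CPS>, the expanded string ends up with a doubled separator ('::') at that point. $PWD is then resolved by the shell that starts the container, so the last two entries point at pyspark.zip and py4j-src.zip in the container's working directory.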