In AWS EMR, /etc/zeppelin/conf/zeppelin-env.sh contains this:
export SPARK_SUBMIT_OPTIONS="$SPARK_SUBMIT_OPTIONS \
--conf 'spark.executorEnv.PYTHONPATH=/usr/lib/spark/python/lib/py4j-src.zip:/usr/lib/spark/python/:<CPS>{{PWD}}/pyspark.zip<CPS>{{PWD}}/py4j-src.zip' \
--conf spark.yarn.isPython=true"
What is this <CPS> in spark.executorEnv.PYTHONPATH?
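<CPS> stands for "classpath separator". It is a placeholder that YARN replaces with the platform's actual path separator (':' on Linux, ';' on Windows) when it launches a container.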
See https://issues.apache.org/jira/browse/YARN-6554 for a reference.
It's a little odd that the setting you see mixes <CPS> and ':'. To be platform independent, it should probably use <CPS> in place of every ':'. However, since EMR only supports running on Amazon Linux, the setting does not need to be that portable.
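To make the substitution concrete, here is a minimal Python sketch of what the expansion amounts to. It is only an illustration, not YARN's actual code: the function name `expand_yarn_placeholders` and its `windows` flag are hypothetical, and the handling of `{{PWD}}` (turned into `$PWD` on Linux or `%PWD%` on Windows) reflects my understanding of YARN's parameter-expansion markers rather than anything stated above.

```python
import re


def expand_yarn_placeholders(value: str, windows: bool = False) -> str:
    """Hypothetical helper: mimic how YARN placeholders might be expanded."""
    # <CPS> becomes the platform's path separator.
    separator = ";" if windows else ":"
    expanded = value.replace("<CPS>", separator)
    # {{VAR}} becomes a shell-style variable reference (assumption, see lead-in).
    if windows:
        return re.sub(r"\{\{(\w+)\}\}", r"%\1%", expanded)
    return re.sub(r"\{\{(\w+)\}\}", r"$\1", expanded)


# The PYTHONPATH value from the question, unchanged.
raw = (
    "/usr/lib/spark/python/lib/py4j-src.zip:/usr/lib/spark/python/:"
    "<CPS>{{PWD}}/pyspark.zip<CPS>{{PWD}}/py4j-src.zip"
)

linux_value = expand_yarn_placeholders(raw)
print(linux_value)
```

Under these assumptions, the Linux result contains `::` before `$PWD/pyspark.zip`, because the original value has a literal ':' immediately followed by `<CPS>`.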