I was trying to run a Spark script on HDFS using PuTTY.
spark-submit WorstMoviesSpark.py
But when I ran the command above, it returned an error:
python: can't open file '/home/maria_dev/WorstMoviesSpark.py': [Errno 2] No such file or directory
*Edit: I was just being stupid. I had loaded my code onto my local machine instead of onto the HDP sandbox. Useful answer below, though.
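For anyone hitting the same thing, a rough sketch of the fix: copy the script onto the sandbox (and into HDFS if you want to submit it from there) before running spark-submit. The SSH port 2222, the 127.0.0.1 host, and the maria_dev paths below are HDP sandbox defaults, so adjust them to your setup:

# copy the script from your local machine onto the sandbox
scp -P 2222 WorstMoviesSpark.py maria_dev@127.0.0.1:/home/maria_dev/
# optionally also push it into HDFS
hdfs dfs -put /home/maria_dev/WorstMoviesSpark.py /user/maria_dev/
# confirm the file is where you expect it
hdfs dfs -ls /user/maria_dev/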
You do not have the required permissions on the code file to execute it via Spark.
Run the following command:
hdfs dfs -chmod 777 WorstMoviesSpark.py
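To confirm the change took effect, a quick check (assuming the script sits in the user's HDFS home directory; adjust the path to wherever it actually lives):

# the permission column should now read -rwxrwxrwx
hdfs dfs -ls /user/maria_dev/WorstMoviesSpark.py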
Then, in your spark-submit command, specify yarn as the master when running the code, as follows:
spark-submit --master yarn --deploy-mode client /hdfs/path/to/WorstMoviesSpark.py
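Once the job is submitted, these YARN commands are a quick way to sanity-check that it actually ran; the application id is a placeholder you would take from the spark-submit output or from the application list:

# show submitted/running applications
yarn application -list
# pull the aggregated logs for a finished application (requires log aggregation to be enabled)
yarn logs -applicationId <application_id>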