Tags: bash, macos, shell, environment-variables, fish

spark-submit: command not found


A very simple question:

I'm trying to use a bash script to submit Spark jobs, but it keeps complaining that it cannot find the spark-submit command. When I copy the command out and run it directly in my terminal, it runs fine.

My shell is fish. Here's what I have in my fish config, ~/.config/fish/config.fish:

alias spark-submit='/Users/MY_NAME/Downloads/spark-2.0.2-bin-hadoop2.7/bin/spark-submit'

Here's my bash script:

#!/usr/bin/env bash


SUBMIT_COMMAND="HADOOP_USER_NAME=hdfs spark-submit \
      --master $MASTER \
      --deploy-mode client \
      --driver-memory $DRIVER_MEMORY \
      --executor-memory $EXECUTOR_MEMORY \
      --num-executors $NUM_EXECUTORS \
      --executor-cores $EXECUTOR_CORES \
      --conf spark.shuffle.compress=true \
      --conf spark.network.timeout=2000s \
      $DEBUG_PARAM \
      --class com.fisher.coder.OfflineIndexer \
      --verbose \
      $JAR_PATH \
      --local $LOCAL \
      $SOLR_HOME \
      --solrconfig 'resource:solrhome/' \
      $ZK_QUORUM_PARAM \
      --source $SOURCE \
      --limit $LIMIT \
      --sample $SAMPLE \
      --dest $DEST \
      --copysolrconfig \
      --shards $SHARDS \
      $S3_ZK_ZNODE_PARENT \
      $S3_HBASE_ROOTDIR \
      "

eval "$SUBMIT_COMMAND"

What I've tried: the command runs perfectly fine in fish on my Mac when I copy it out literally and run it directly. However, what I want is to be able to run ./submit.sh -local, which executes the script above.

Any clues please?


Solution

  • You seem to be confused about what a fish alias is. When you run this:

    alias spark-submit='/Users/MY_NAME/Downloads/spark-2.0.2-bin-hadoop2.7/bin/spark-submit'
    

    You are actually doing this:

    function spark-submit
       /Users/MY_NAME/Downloads/spark-2.0.2-bin-hadoop2.7/bin/spark-submit $argv
    end
    

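    That is, you are defining a fish function. Your bash script runs in a separate bash process, which has no knowledge of functions defined in fish. You need to either add the directory containing spark-submit to your $PATH variable, call it by its full path in the script, or define an equivalent function in your bash script. Note that a plain bash alias would not help by default: bash does not expand aliases in non-interactive scripts unless shopt -s expand_aliases is set.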
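
As a minimal sketch of the $PATH option, something like the following could go near the top of the bash script. The helper name add_spark_to_path is hypothetical (it is not part of Spark or bash), and the Spark directory shown in the comment is the one from the question; adjust it to wherever Spark is actually unpacked.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prepend <spark_home>/bin to PATH so that a bare
# "spark-submit" resolves inside this script, independent of fish.
add_spark_to_path() {
    local spark_home="$1"

    # Fail early with a clear message instead of "command not found" later.
    if [ ! -x "$spark_home/bin/spark-submit" ]; then
        echo "spark-submit not found or not executable in $spark_home/bin" >&2
        return 1
    fi

    PATH="$spark_home/bin:$PATH"
    export PATH
}

# Example use in submit.sh (path taken from the question, an assumption here):
# add_spark_to_path "/Users/MY_NAME/Downloads/spark-2.0.2-bin-hadoop2.7" || exit 1
```

Once the helper has been called successfully, the existing eval "$SUBMIT_COMMAND" line should find spark-submit through PATH. Alternatively, adding the same bin directory to PATH in the fish config would also work, because fish exports PATH to the child processes it starts, including the bash script.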