Tags: python, boto3, amazon-emr, boto

Arguments for jar file incorrect - spinning up EMR cluster using Boto3


I am writing Python code using the Boto3 library to spin up an EMR cluster. In the Steps part, I have my JAR file listed. The JAR is a compiled Scala program that takes arguments like this:

-l 'some_value' -s 'some_value'

How can I correctly input these kinds of arguments in the Args value for Steps? Here is what I have:

jar_file = 'file.jar'
ARG1 = 'some_value'
ARG2 = 'some_value'
steps = [
    {
        'Name': 'Running jar file step',
        'ActionOnFailure': 'CONTINUE',
        'HadoopJarStep': {
            'Jar': 's3://mybucket/{0}'.format(jar_file),
            'Args': [
                '-l {0}'.format(ARG1), '-s {0}'.format(ARG2)
            ]
        }
    }
]

My cluster terminates with errors on the JAR step, and I'm getting this error:

Error: Unknown option -l some_value -s some_value
Usage: spark-zoning [options]

  -l, --id1 <value> 
  -s, --id2 <value>
Exception in thread "main" scala.MatchError: None (of class scala.None$)
    at spark_pkg.SparkMain$.main(SparkMain.scala:208)
    at spark_pkg.SparkMain.main(SparkMain.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

How can I correctly supply the arguments?


Solution

  • @jordanm's answer was correct: each flag and its value must be separate list elements rather than a single formatted string. I changed my arguments to look like this: 'Args': ['-l', ARG1, '-s', ARG2]
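
For reference, here is a minimal sketch of the corrected step definition submitted through boto3's add_job_flow_steps. The region, bucket name, and cluster ID ('j-XXXXXXXXXXXXX') are placeholders, not values from the original question.

import boto3

# Region is an assumption; use whatever region your cluster runs in.
emr = boto3.client('emr', region_name='us-east-1')

jar_file = 'file.jar'
ARG1 = 'some_value'
ARG2 = 'some_value'

steps = [
    {
        'Name': 'Running jar file step',
        'ActionOnFailure': 'CONTINUE',
        'HadoopJarStep': {
            'Jar': 's3://mybucket/{0}'.format(jar_file),
            # Each flag and its value are separate list elements. EMR passes
            # them to the JAR's main() verbatim, so '-l some_value' as one
            # string shows up as a single unknown option.
            'Args': ['-l', ARG1, '-s', ARG2]
        }
    }
]

# Attach the step to an existing cluster (the ID below is a placeholder).
response = emr.add_job_flow_steps(JobFlowId='j-XXXXXXXXXXXXX', Steps=steps)
print(response['StepIds'])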