python amazon-web-services aws-lambda aws-glue

What are the alternative sources of input for the args read by getResolvedOptions() in an AWS Glue job?


I have a Glue job in which I want to pass parameters to getResolvedOptions. One way I know of is to create a JobRun from a Lambda function and pass the parameters there. What other ways are there to pass param1 and param2 in the code below?

import sys
from awsglue.utils import getResolvedOptions

args = getResolvedOptions(sys.argv, ['param1',
                                     'param2'])

Note: I don't want to hardcode the parameter values in the script.

Thanks in Advance.


Solution

  • You can do this with a CloudFormation (CFN) YAML template, or alternatively add the arguments directly to the job via the CLI, SDK, or console. If you go the CloudFormation route, you could define the resource as follows:

      JobNAME:
        Type: "AWS::Glue::Job"
        Properties:
          Name: String
          Description: String
          Role: String
          GlueVersion: "1.0"
          Command: 
            Name: "glueetl"
            ScriptLocation: String
            PythonVersion: "3"
          DefaultArguments:
            "--job-language": "python"
            "--param1": VALUE
            "--param2": VALUE
            "--TempDir": String
            "--job-bookmark-option": "job-bookmark-enable"
            "--enable-continuous-cloudwatch-log": "false"
            "--enable-continuous-log-filter": "false"
            "--enable-metrics": "false"
          ExecutionProperty:
            MaxConcurrentRuns: 1
          MaxCapacity: 5
          MaxRetries: 1
          Timeout: 60
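
For the SDK route mentioned above, here is a minimal sketch of passing the same parameters when starting a run with boto3. It assumes the job already exists; the helper names `to_glue_arguments` and `start_job_with_params` and the job name `my-job` are hypothetical. Arguments passed this way apply to that run only, whereas `DefaultArguments` on the job definition apply to every run.

```python
from typing import Any, Dict


def to_glue_arguments(params: Dict[str, Any]) -> Dict[str, str]:
    """Prefix each key with '--' and stringify the values, the shape Glue job arguments use."""
    return {f"--{key.lstrip('-')}": str(value) for key, value in params.items()}


def start_job_with_params(glue_client: Any, job_name: str, params: Dict[str, Any]) -> str:
    """Start a run of an existing job and pass params as per-run arguments."""
    response = glue_client.start_job_run(
        JobName=job_name,
        Arguments=to_glue_arguments(params),
    )
    return response["JobRunId"]


# Usage (needs boto3 and AWS credentials; "my-job" is a hypothetical job name):
#   import boto3
#   run_id = start_job_with_params(
#       boto3.client("glue"), "my-job", {"param1": "a", "param2": "b"}
#   )
```

The client is passed in as an argument so that the same helper can be called from a Lambda function, a scheduler, or a local script.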
    

    Once the arguments are defined on the job, you can read them in the script through getResolvedOptions. Note that some names, such as JOB_NAME, are set by Glue itself, e.g.:

    import sys
    from awsglue.utils import getResolvedOptions
    
    # JOB_NAME is assigned by Glue by default; param1 and param2 are your own values.
    args = getResolvedOptions(sys.argv, ['JOB_NAME', 'param1', 'param2'])
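
To see how the `--param1 VALUE` pairs end up in `args`, here is a simplified local stand-in built on `argparse`. This is not the awsglue implementation, only a sketch for testing a script outside Glue; the function name `resolve_options_locally` and the sample values are hypothetical.

```python
import argparse
from typing import Dict, List


def resolve_options_locally(argv: List[str], names: List[str]) -> Dict[str, str]:
    """Simplified stand-in for getResolvedOptions, intended for local testing only."""
    parser = argparse.ArgumentParser(allow_abbrev=False)
    for name in names:
        # Each requested name must be present on the command line as --name VALUE.
        parser.add_argument(f"--{name}", required=True)
    # argv[0] is the script path; flags that were not requested are ignored.
    known, _unknown = parser.parse_known_args(argv[1:])
    return vars(known)


# Simulate the argument list a job run might receive (values are made up).
fake_argv = [
    "script.py",
    "--JOB_NAME", "demo-job",
    "--param1", "a",
    "--param2", "b",
    "--TempDir", "s3://example-bucket/tmp",
]
args = resolve_options_locally(fake_argv, ["JOB_NAME", "param1", "param2"])
print(args["param1"], args["param2"])
```

The returned dictionary is keyed by the names without the leading dashes, which is how the script above reads `args['param1']`. In this stand-in a missing required argument makes `argparse` exit with an error.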