aws-glue aws-step-functions

How to pass input to the task arguments in step function Map state?


I created a state machine to run some Glue ETL jobs in parallel. I'm experimenting with the Map state to take advantage of dynamic parallelism. Here is the step function definition:

{
 "StartAt": "Map",
 "States": {
   "Map": {
     "Type": "Map",
     "InputPath": "$.data",
     "ItemsPath": "$.array",
     "MaxConcurrency": 2,
     "Iterator": {
       "StartAt": "glue Job",
       "States": {
         "glue Job": {
           "Type": "Task",
           "Resource": "arn:aws:states:::glue:startJobRun.sync",
           "End": true,
           "Parameters": {
             "JobName": "glue-etl-job",
             "Arguments": {
               "--db": "db-dev",
               "--file": "$.file",
               "--bucket": "$.bucket"
             }
           }
         }
       }
     },
     "Catch": [
       {
         "ErrorEquals": [
           "States.ALL"
         ],
         "Next": "NotifyError"
       }
     ],
     "Next": "NotifySuccess"
   },

 }
}

The input passed to the step function looks like this:

{
 "data": {
   "array": [
     {"file": "path-to-file1", "bucket": "bucket-name1"},
     {"file": "path-to-file2", "bucket": "bucket-name2"}
   ]
 }
}

The problem is that the file and bucket job arguments don't get resolved: they are passed to the Glue job as the literal strings $.file and $.bucket. How can I pass the actual values from the input as arguments?


Solution

  • When a parameter's value should be read from the state input, append '.$' to the parameter name. Without that suffix, the value is treated as a literal string.

    "--file.$": "$.file",
    "--bucket.$": "$.bucket"
    

    For the complete guide, see the Amazon States Language specification: https://states-language.net/spec.html#parameters
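
To show the fix in context, here is a sketch of the "glue Job" task state from the question with the two keys renamed. Everything except the '.$' suffixes is copied from the question's definition. The state is wrapped in an outer object only so the fragment is valid JSON on its own.

```json
{
  "glue Job": {
    "Type": "Task",
    "Resource": "arn:aws:states:::glue:startJobRun.sync",
    "Parameters": {
      "JobName": "glue-etl-job",
      "Arguments": {
        "--db": "db-dev",
        "--file.$": "$.file",
        "--bucket.$": "$.bucket"
      }
    },
    "End": true
  }
}
```

Each Map iteration receives one element of $.data.array as its input, so for the first item in the sample input the job arguments should resolve to --file = path-to-file1 and --bucket = bucket-name1, while --db stays the literal db-dev. The '.$' suffix is dropped from the key name once the path is resolved.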