AWS Glue job dependency in Step Functions


I've created two Glue jobs (gluejob1 and gluejob2).

I want to create a dependency so that gluejob2 runs only after gluejob1 has completed.

To orchestrate this, I created a step function with the definition below:

 {
  "gluejob1": {
    "Type": "Task",
    "Resource": "gluejob1.Arn",
    "Comment": "Glue job1.",
    "Next": "gluejob2"
  },

  "gluejob2": {
    "Type": "Task",
    "Resource": "gluejob2.Arn",
    "Comment": "Glue job2.",
    "Next": "Gluejob2 Finished Loading"
  },
  "Gluejob2 Finished Loading": {
    "Type": "Pass",
    "Result": "",
    "End": true
  }
}

When I execute this step function, it marks the gluejob1 state as successful the moment it triggers gluejob1, and then moves straight on to trigger gluejob2.

I'm wondering if there is a possibility to run gluejob2 only after gluejob1 completes.


Solution

  • You can invoke the Glue job from Step Functions synchronously by using the .sync service integration (arn:aws:states:::glue:startJobRun.sync), so that each Task state waits for the job run to complete before moving to the next state. Replace ETLJobName1 and ETLJobName2 with your Glue job names; a literal name goes under "JobName", without the ".$" suffix:

    {
      "StartAt": "gluejob1",
      "States": {
        "gluejob1": {
          "Type": "Task",
          "Resource": "arn:aws:states:::glue:startJobRun.sync",
          "Parameters": {
            "JobName": "ETLJobName1"
          },
          "Next": "gluejob2"
        },
        "gluejob2": {
          "Type": "Task",
          "Resource": "arn:aws:states:::glue:startJobRun.sync",
          "Parameters": {
            "JobName": "ETLJobName2"
          },
          "Next": "Gluejob2 Finished Loading"
        },
        "Gluejob2 Finished Loading": {
          "Type": "Pass",
          "Result": "",
          "End": true
        }
      }
    }
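
If the job names should come from the execution input rather than being hard-coded, the ".$" suffix is the place for that: Step Functions treats a parameter whose key ends in ".$" as a JSONPath reference, so its value has to start with "$". The following is only a minimal sketch under that assumption. The input fields firstJobName and secondJobName and the ResultPath targets are hypothetical names chosen for illustration; they are not part of the original answer.

```json
{
  "Comment": "Sketch: job names are read from the execution input (field names are hypothetical).",
  "StartAt": "gluejob1",
  "States": {
    "gluejob1": {
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun.sync",
      "Parameters": {
        "JobName.$": "$.firstJobName"
      },
      "ResultPath": "$.gluejob1Result",
      "Next": "gluejob2"
    },
    "gluejob2": {
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun.sync",
      "Parameters": {
        "JobName.$": "$.secondJobName"
      },
      "ResultPath": "$.gluejob2Result",
      "End": true
    }
  }
}
```

An execution of this sketch would be started with an input such as {"firstJobName": "gluejob1", "secondJobName": "gluejob2"}. The ResultPath on gluejob1 is there so that the job run result is added to the input instead of replacing it; without it, the second state would no longer find $.secondJobName.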