EventBridge trigger: SageMaker Processing Job finished


I'm currently developing an ETL pipeline for my ML model on AWS, and I want to trigger a Lambda when a SageMaker Processing Job finishes. The event passed to the Lambda should contain the configuration info (job name, arguments, etc.) of the SageMaker Processing Job.

Q1: How can I trigger the event when the Processing Job finishes?

Q2: How can I pass the Processing Job configuration as the event for the Lambda?


Solution

  • You can use the following EventBridge rule pattern:

    {
      "source": ["aws.sagemaker"],
      "detail-type": ["SageMaker Processing Job State Change"],
      "detail": {
        "ProcessingJobStatus": ["Failed", "Completed", "Stopped"]
      }
    }
    

    The ProcessingJobStatus list can be modified based on which statuses you want to handle.

    You can set a Lambda function as the target of your EventBridge rule.

    Here is a sample event that will be passed to your Lambda, taken from the AWS console. The job configuration is carried in its "detail" field:

    {
      "version": "0",
      "id": "0a15f67d-aa23-0123-0123-01a23w89r01t",
      "detail-type": "SageMaker Processing Job State Change",
      "source": "aws.sagemaker",
      "account": "123456789012",
      "time": "2019-05-31T21:49:54Z",
      "region": "us-east-1",
      "resources": ["arn:aws:sagemaker:us-west-2:012345678987:processing-job/integ-test-analytics-algo-54ee3282-5899-4aa3-afc2-7ce1d02"],
      "detail": {
        "ProcessingInputs": [{
          "InputName": "InputName",
          "S3Input": {
            "S3Uri": "s3://input/s3/uri",
            "LocalPath": "/opt/ml/processing/input/local/path",
            "S3DataType": "MANIFEST_FILE",
            "S3InputMode": "PIPE",
            "S3DataDistributionType": "FULLYREPLICATED"
          }
        }],
        "ProcessingOutputConfig": {
          "Outputs": [{
            "OutputName": "OutputName",
            "S3Output": {
              "S3Uri": "s3://output/s3/uri",
              "LocalPath": "/opt/ml/processing/output/local/path",
              "S3UploadMode": "CONTINUOUS"
            }
          }],
          "KmsKeyId": "KmsKeyId"
        },
        "ProcessingJobName": "integ-test-analytics-algo-54ee3282-5899-4aa3-afc2-7ce1d02",
        "ProcessingResources": {
          "ClusterConfig": {
            "InstanceCount": 3,
            "InstanceType": "ml.c5.xlarge",
            "VolumeSizeInGB": 5,
            "VolumeKmsKeyId": "VolumeKmsKeyId"
          }
        },
        "StoppingCondition": {
          "MaxRuntimeInSeconds": 2000
        },
        "AppSpecification": {
          "ImageUri": "012345678901.dkr.ecr.us-west-2.amazonaws.com/processing-uri:latest"
        },
        "NetworkConfig": {
          "EnableInterContainerTrafficEncryption": true,
          "EnableNetworkIsolation": false,
          "VpcConfig": {
            "SecurityGroupIds": ["SecurityGroupId1", "SecurityGroupId2", "SecurityGroupId3"],
            "Subnets": ["Subnet1", "Subnet2"]
          }
        },
        "RoleArn": "arn:aws:iam::012345678987:role/SageMakerPowerUser",
        "ExperimentConfig": {},
        "ProcessingJobArn": "arn:aws:sagemaker:us-west-2:012345678987:processing-job/integ-test-analytics-algo-54ee3282-5899-4aa3-afc2-7ce1d02",
        "ProcessingJobStatus": "Completed",
        "LastModifiedTime": 1589879735000,
        "CreationTime": 1589879735000
      }
    }
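
To make the second question concrete, here is a minimal sketch of a Lambda handler that pulls the job configuration out of the "detail" field of an event shaped like the sample above. The helper name `extract_job_config` and the `CONFIG_KEYS` selection are hypothetical. `ContainerArguments` does not appear in the sample event, so reading it from `AppSpecification` is an assumption and the code falls back to an empty list.

```python
import json

# Fields copied from event["detail"]; the names follow the sample event.
CONFIG_KEYS = (
    "ProcessingJobName",
    "ProcessingJobArn",
    "ProcessingJobStatus",
    "ProcessingInputs",
    "ProcessingOutputConfig",
    "ProcessingResources",
    "AppSpecification",
    "RoleArn",
)


def extract_job_config(event):
    """Return the job configuration fields found in the event's detail."""
    detail = event.get("detail", {})
    config = {key: detail[key] for key in CONFIG_KEYS if key in detail}
    # Assumption: ContainerArguments is not in the sample event,
    # so it is read defensively with an empty-list default.
    app_spec = detail.get("AppSpecification", {})
    config["ContainerArguments"] = app_spec.get("ContainerArguments", [])
    return config


def lambda_handler(event, context):
    """Entry point invoked when the EventBridge rule matches."""
    config = extract_job_config(event)
    print(json.dumps(config))  # printed output goes to the function's logs
    return config
```

Keeping the extraction in a separate function makes it easy to test locally with a saved copy of the sample event, without deploying anything.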
    

    Edit:

    If you want to match a ProcessingJobName with a specific prefix:

    {
      "source": ["aws.sagemaker"],
      "detail-type": ["SageMaker Processing Job State Change"],
      "detail": {
        "ProcessingJobStatus": ["Failed", "Completed", "Stopped"],
        "ProcessingJobName": [{
          "prefix": "standarize-data"
        }]
      }
    }
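
If you need several variants of these patterns, a small helper can generate them. This is only a sketch: `build_event_pattern` is a hypothetical name, and it simply reproduces the two JSON patterns shown in this answer.

```python
import json


def build_event_pattern(statuses, name_prefix=None):
    """Build an EventBridge pattern for SageMaker Processing Job state changes."""
    detail = {"ProcessingJobStatus": list(statuses)}
    if name_prefix is not None:
        # Prefix matching on the job name, as in the second pattern.
        detail["ProcessingJobName"] = [{"prefix": name_prefix}]
    return {
        "source": ["aws.sagemaker"],
        "detail-type": ["SageMaker Processing Job State Change"],
        "detail": detail,
    }


# Example: only completed jobs whose name starts with "standarize-data".
pattern_json = json.dumps(
    build_event_pattern(["Completed"], name_prefix="standarize-data")
)
```

The resulting string can be pasted into the rule editor in the console, or passed as the `EventPattern` argument when creating the rule with boto3's `put_rule`.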