amazon-web-services error-handling aws-lambda amazon-ses

How to capture the SageMaker error when a job fails and send a notification via SES or SNS


I have a Lambda function that creates a SageMaker processing job. If the job fails because of an algorithm error, an API error, etc., how do I capture the exact error message (for example, see the picture) and send an email, either from the same Lambda function or from a separate event?

https://anonfiles.com/d308Jf15ue/2021-06-17_22_36_21-Amazon_SageMaker_png


Solution

  • Here's what I did. I used CloudWatch Events for monitoring and set the rule's target to an SNS topic that my email address is subscribed to. Here's the event pattern I used:

    {
      "source": ["aws.sagemaker"],
      "detail-type": ["SageMaker Processing Job State Change"],
      "detail": {
        "ProcessingJobStatus": ["Failed"]
      }
    }

    A CloudWatch Events target also has an input transformer, which lets you extract data from the received event and pass it to SNS. The event data should include the error message.

    https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/CloudWatch-Events-Input-Transformer-Tutorial.html
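The setup described above can also be scripted. What follows is only a minimal sketch, not the answerer's actual configuration: it assumes a boto3 `events` client is passed in, that the event's `detail` carries `ProcessingJobName` and `FailureReason` (check a real event before relying on these), and the rule name, target ID, and topic ARN are hypothetical placeholders.

```python
import json

# Event pattern from the answer: match failed SageMaker Processing jobs.
EVENT_PATTERN = {
    "source": ["aws.sagemaker"],
    "detail-type": ["SageMaker Processing Job State Change"],
    "detail": {"ProcessingJobStatus": ["Failed"]},
}


def build_sns_target(topic_arn, target_id="notify-on-failure"):
    """Build a rule target that sends a short message to an SNS topic.

    Assumption: the event detail contains ProcessingJobName and
    FailureReason; the target_id default is a hypothetical name.
    """
    return {
        "Id": target_id,
        "Arn": topic_arn,
        "InputTransformer": {
            # Pull fields out of the event with JSONPath expressions.
            "InputPathsMap": {
                "job": "$.detail.ProcessingJobName",
                "reason": "$.detail.FailureReason",
            },
            # The template must itself be valid JSON, hence the inner quotes.
            "InputTemplate": '"SageMaker processing job <job> failed: <reason>"',
        },
    }


def create_failure_rule(events_client, rule_name, topic_arn):
    """Create (or update) the rule, then attach the SNS target to it."""
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(EVENT_PATTERN),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[build_sns_target(topic_arn)],
    )


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   create_failure_rule(
#       boto3.client("events"),
#       "sagemaker-processing-failed",
#       "arn:aws:sns:us-east-1:123456789012:my-topic",
#   )
```

When a rule is created through the API rather than the console, the SNS topic's access policy also has to allow `events.amazonaws.com` to publish to it; otherwise the rule matches but no email arrives.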