python aws-lambda aws-glue

Starting an AWS Glue job from Lambda using Python?


I am trying to start an AWS Glue job (Python shell) from a Lambda function when a new file is dropped into an S3 bucket. The Glue job is set up and works as expected when I run it manually. I thought that starting it from a Lambda triggered by the S3 create event would be simple. So far the Lambda exists and does run when the S3 file is created, but it neither starts the Glue job nor gives any feedback as to why. Below is the Python 3.8 code I have been using in my Lambda:

import boto3
from botocore.exceptions import ClientError

def handler(event, context):
    glue_client = boto3.client('glue')
    job_name = 'my-glue-job-name'

    try:
        print('Attempting to start glue job:', job_name)
        # start_job_run returns a dict; the run id is under 'JobRunId'
        response = glue_client.start_job_run(JobName=job_name)
        job_run_id = response['JobRunId']
        print('Running Glue job, id:', job_run_id)
        return job_run_id
    except ClientError as e:
        print('>>>>>error 1:', e)
        raise Exception("boto3 client error in run_glue_job: " + str(e)) from e
    except Exception as e:
        print('>>>>>error 2:', e)
        raise Exception("Unexpected error in run_glue_job: " + str(e)) from e

When I check the Lambda's logs, I can see that the Lambda started when the file was created in S3, and I can see the printed entry 'Attempting to start glue job: my-glue-job-name'. That's all I see: there is no 'Running Glue job, id: xxx' log entry and no error message. Likewise, the Glue job logs show no sign of a run being started.

I have given the Lambda the AWSGlueServiceRole policy, so I don't think it's a permissions issue.

Any ideas are appreciated.


Solution

  • It turns out it wasn't an IAM permissions issue at all, but a VPC issue. In our account, the Glue service needs a VPC endpoint added before other services can access it. Once that was done, it worked as expected.
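
For anyone hitting the same silent hang, here is a minimal sketch of how the Lambda could be made to fail loudly instead. It does not replace the VPC endpoint fix. It assumes (the post does not confirm this) that the call was simply waiting on a network connection that could never be established; botocore's default connect and read timeouts are 60 seconds each, which can outlast a short Lambda timeout. The timeout values and the helper names make_glue_client and start_glue_job are illustrative, not part of the original code.

```python
def make_glue_client():
    """Build a Glue client that gives up quickly instead of hanging."""
    # Imported here so start_glue_job below can be exercised without the AWS SDK.
    import boto3
    from botocore.config import Config

    config = Config(
        connect_timeout=5,            # seconds to wait for a connection (assumed value)
        read_timeout=10,              # seconds to wait for a response (assumed value)
        retries={"max_attempts": 1},  # keep retries low so the failure surfaces quickly
    )
    return boto3.client("glue", config=config)


def start_glue_job(glue_client, job_name):
    """Start the job and return only the run id string."""
    response = glue_client.start_job_run(JobName=job_name)
    return response["JobRunId"]


def handler(event, context):
    job_name = "my-glue-job-name"
    print("Attempting to start glue job:", job_name)
    # A connection timeout raises a botocore error here, which Lambda
    # logs with a traceback rather than stalling without output.
    run_id = start_glue_job(make_glue_client(), job_name)
    print("Running Glue job, id:", run_id)
    return run_id
```

With short timeouts, a Lambda inside a VPC that has no route to Glue should log a connection timeout error within seconds, which points at networking rather than IAM.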