python aws-lambda aws-glue

How do I trigger a Glue job with AWS Lambda using Python?


Let's say I have a Glue job named FirstGlueJob.

How can I trigger it from a Lambda function written in Python?


Solution

  • We can configure a Lambda S3 event trigger on the landing folder so that, when a file is uploaded, a brief script in Lambda starts the Glue job. The Glue Python script should contain the logic required to convert the input text files into CSV files. This way the job runs every time a file is uploaded to S3.

    You are billed only for the time the job runs. Be aware that Glue is relatively expensive because it is a managed service.

    Once the event trigger is created, have it start the Glue job. Here is a code snippet for AWS Lambda:

    import boto3

    # Create the client once, outside the handler, so warm invocations reuse it.
    glue = boto3.client('glue')

    def lambda_handler(event, context):
        gluejobname = "<< THE GLUE JOB NAME >>"

        try:
            # start_job_run returns a dict that contains the new run's JobRunId.
            run = glue.start_job_run(JobName=gluejobname)
            # The run was only just submitted, so this reports an early state.
            status = glue.get_job_run(JobName=gluejobname, RunId=run['JobRunId'])
            print("Job Status : ", status['JobRun']['JobRunState'])
        except Exception as e:
            print(e)
            print('Error starting Glue job {}. Make sure it exists and is in '
                  'the same region as this function.'.format(gluejobname))
            # Re-raise so Lambda records the invocation as failed.
            raise
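
If the Glue job needs to know which file was uploaded, the handler can read the bucket and key from the S3 event and pass them along as job arguments. The following is only a minimal sketch under a few assumptions: the job name `FirstGlueJob` comes from the question, the argument names `--source_bucket` and `--source_key` are hypothetical and must match whatever the Glue script expects, and the helper names are made up for illustration. The Glue client is passed in as a parameter so the logic can be exercised without AWS access.

```python
from urllib.parse import unquote_plus

GLUE_JOB_NAME = "FirstGlueJob"  # job name taken from the question


def build_job_arguments(event):
    """Extract the uploaded object's location from an S3 event notification."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 events (spaces become '+').
    key = unquote_plus(record["s3"]["object"]["key"])
    # Hypothetical argument names; Glue job argument keys start with '--'.
    return {"--source_bucket": bucket, "--source_key": key}


def start_job_for_upload(glue, event, job_name=GLUE_JOB_NAME):
    """Start the Glue job for the uploaded file and return the run id."""
    response = glue.start_job_run(
        JobName=job_name,
        Arguments=build_job_arguments(event),
    )
    return response["JobRunId"]


def lambda_handler(event, context):
    # Imported here so the helpers above can be tested without boto3 installed.
    import boto3

    glue = boto3.client("glue")
    run_id = start_job_for_upload(glue, event)
    print("Started Glue job run:", run_id)
    return {"JobRunId": run_id}
```

The Lambda execution role also needs permission to call `glue:StartJobRun` on the job, otherwise the call fails with an access-denied error.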