I'm working with the AWS Glue API using Boto3, specifically the get_job_runs method.
This method allows me to retrieve metadata for all runs of a given job definition.
However, I have a scenario where I need to retrieve metadata for the last runs of multiple job definitions, and I'm looking for a way to achieve this with a single API call. Currently, I'm making separate API calls for each job definition, which is not very efficient.
Is there a way to optimize this process and retrieve the metadata for the last runs of multiple job definitions using a single API call? Any guidance or code examples using Boto3 would be greatly appreciated. Thank you!
Update: Depending on the number of jobs, the Step Functions approach described below may exceed the Glue API request quota and get throttled. In that case, the Map state can be configured to retry automatically. This won't reduce the execution time, but you don't need to handle the errors yourself. And, compared to a Lambda function, this is much more cost-effective.
At this time (August 2023), the service doesn't provide an API that aggregates the run data for multiple jobs.
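So one get_job_runs call per job is still required. As a point of reference, here is a minimal Boto3 sketch of that per-job approach. The helper names latest_run and latest_runs are made up for illustration, and the code assumes the most recent runs show up on the first page of results, which I have not seen guaranteed in the documentation. To avoid depending on the order within that page, it picks the run with the greatest StartedAt.

```python
from typing import Any, Dict, Iterable, Optional


def latest_run(glue_client: Any, job_name: str) -> Optional[Dict[str, Any]]:
    """Return the most recently started run of one job, or None if it has no runs."""
    # One API call per job definition; only the first page of results is inspected.
    # Assumption: the newest runs are on that first page.
    response = glue_client.get_job_runs(JobName=job_name, MaxResults=10)
    runs = response.get("JobRuns", [])
    # Choose by StartedAt so the result does not depend on the order within the page.
    return max(runs, key=lambda run: run["StartedAt"], default=None)


def latest_runs(glue_client: Any, job_names: Iterable[str]) -> Dict[str, Optional[Dict[str, Any]]]:
    """Map each job name to its latest run, calling the API once per job."""
    return {name: latest_run(glue_client, name) for name in job_names}
```

With real credentials this would be called as latest_runs(boto3.client("glue"), ["job-a", "job-b"]), where the job names are placeholders. The calls run sequentially, so the total time grows with the number of jobs.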
However, to reduce the execution time, consider using AWS Step Functions, specifically the Map state. It allows you to call get_job_runs for each job in parallel and collect the results in a single output array.
Because Step Functions can directly integrate with the Glue API, you don't even need to write any custom code. All it needs is an array of job names.
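A state machine along these lines could look roughly like the sketch below, written as a Python dict that serializes to the Amazon States Language definition. Treat it as an illustration under a few assumptions rather than a tested deployment: the input key jobNames, the state names, the concurrency limit, and the retry settings are all my own choices, and MaxResults of 1 only yields the last run if the API returns the newest run first. The retry rule catches States.ALL because I'm not certain of the exact throttling error name the Glue integration reports.

```python
import json
from typing import Any, Dict


def build_state_machine_definition(max_concurrency: int = 5) -> Dict[str, Any]:
    """Build an ASL definition that calls Glue getJobRuns once per job name."""
    return {
        "Comment": "Fetch the latest run of several Glue jobs in parallel",
        "StartAt": "GetLastRuns",
        "States": {
            "GetLastRuns": {
                "Type": "Map",
                # Assumed input shape: {"jobNames": ["job-a", "job-b", ...]}
                "ItemsPath": "$.jobNames",
                # Limit parallel iterations to stay closer to the request quota.
                "MaxConcurrency": max_concurrency,
                # Wrap each array element (a job name string) in an object.
                "ItemSelector": {"JobName.$": "$$.Map.Item.Value"},
                "ItemProcessor": {
                    "ProcessorConfig": {"Mode": "INLINE"},
                    "StartAt": "GetJobRuns",
                    "States": {
                        "GetJobRuns": {
                            "Type": "Task",
                            # AWS SDK service integration, no Lambda function involved.
                            "Resource": "arn:aws:states:::aws-sdk:glue:getJobRuns",
                            # Assumption: the first result is the most recent run.
                            "Parameters": {"JobName.$": "$.JobName", "MaxResults": 1},
                            # Retry with exponential backoff, e.g. when throttled.
                            "Retry": [
                                {
                                    "ErrorEquals": ["States.ALL"],
                                    "IntervalSeconds": 2,
                                    "MaxAttempts": 5,
                                    "BackoffRate": 2.0,
                                }
                            ],
                            "End": True,
                        }
                    },
                },
                "End": True,
            }
        },
    }


def definition_as_json(max_concurrency: int = 5) -> str:
    """Serialize the definition for use when creating the state machine."""
    return json.dumps(build_state_machine_definition(max_concurrency), indent=2)
```

The resulting JSON can be pasted into the Step Functions console or passed as the definition when creating the state machine. The execution role needs permission for glue:GetJobRuns, and the output of the Map state is an array with one getJobRuns response per job name.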