
How do I filter for ECR images created in the past week?


I am using the boto3 API, but I am open to using the CLI if it gives any more flexibility.

import boto3

client = boto3.session.Session(profile_name="prod").client("ecr", region_name="us-east-1")
response = client.describe_images(repositoryName=repository_name)

Until now I have run the call above and then ordered the results by push date with sorted(response["imageDetails"], key=lambda x: x["imagePushedAt"]). However, I am only getting 90 results back, and I am wondering whether this is an internal limit.

I see that describe_images takes a filter argument. I tried to use it with the code below, but I get this error:

*** botocore.exceptions.ParamValidationError: Parameter validation failed:
Unknown parameter in filter: "Name", must be one of: tagStatus
Unknown parameter in filter: "Values", must be one of: tagStatus

import datetime

date_filter = (datetime.datetime.now() - datetime.timedelta(days=7)).strftime("%Y-%m-%d")
filter={"Name": "imagePushedAt", "Values": [date_filter+"*"]}
response = client.describe_images(repositoryName=repository_name, filter=filter)

Any thoughts on how to filter ECR images?


Solution

  • When you look at the boto3 documentation for [ecr.describe_images](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ecr.html#ECR.Client.describe_images), you will see a few things:

    • An optional nextToken parameter
    • An optional maxResults parameter

    The nextToken is used to iterate and fetch additional batches of results. By default the function returns at most 100 results per batch, though you can raise that to 1000 with maxResults. If (or when) you have more than 1000 images in your repository, you can't avoid using the pagination support, alas.

    You can use the client.get_paginator() approach, if you prefer. Below, I'm building on top of what you already started with, though.
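
For reference, here is a minimal sketch of what the get_paginator() route might look like. The function name images_pushed_since and its arguments are made up for illustration; it assumes you pass in an ECR client built the way you already do, and that imagePushedAt comes back as a timezone-aware datetime, so the cutoff has to be timezone-aware as well.

```python
import datetime


def images_pushed_since(client, repository_name, cutoff):
    """Yield imageDetails entries pushed at or after `cutoff`.

    `client` is assumed to be a boto3 ECR client; `cutoff` must be a
    timezone-aware datetime so it can be compared with imagePushedAt.
    """
    # The paginator follows nextToken for us, one page at a time.
    paginator = client.get_paginator("describe_images")
    for page in paginator.paginate(repositoryName=repository_name):
        for detail in page["imageDetails"]:
            if detail["imagePushedAt"] >= cutoff:
                yield detail


# Possible usage (requires boto3 and valid AWS credentials):
#
#   import boto3
#   client = boto3.session.Session(profile_name="prod").client("ecr", region_name="us-east-1")
#   cutoff = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=7)
#   for detail in images_pushed_since(client, "<repository_name>", cutoff):
#       print(detail["imageDigest"], detail["imagePushedAt"])
```

Note that this compares exact timestamps (the last 7 x 24 hours) rather than calendar dates, so the count can differ slightly from a date-string comparison.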

    The other thing you'll note in the documentation is the filter parameter, which is indeed fairly limited as you noticed.
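
To make that limitation concrete: the error message in the question says the only key accepted inside filter is tagStatus, and the documented values for it are TAGGED, UNTAGGED and ANY. The sketch below only builds the keyword arguments for describe_images; the helper name describe_images_kwargs is hypothetical, not part of boto3.

```python
VALID_TAG_STATUSES = ("TAGGED", "UNTAGGED", "ANY")


def describe_images_kwargs(repository_name, tag_status=None):
    """Build keyword arguments for describe_images (hypothetical helper).

    The filter argument cannot express a date range, so date filtering
    still has to happen client-side on imagePushedAt.
    """
    kwargs = {"repositoryName": repository_name}
    if tag_status is not None:
        if tag_status not in VALID_TAG_STATUSES:
            raise ValueError("tag_status must be one of {}".format(VALID_TAG_STATUSES))
        # tagStatus is the only key the filter parameter accepts.
        kwargs["filter"] = {"tagStatus": tag_status}
    return kwargs


# Possible usage with an existing ECR client:
#   response = client.describe_images(**describe_images_kwargs("<repository_name>", "TAGGED"))
```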

    You mentioned finding 90 images, which suggests that 10 of those first 100 images were pushed more than a week ago.

    Here's one way to iterate over all images and check the imagePushedAt of each of them:

    #!/usr/bin/env python3
    
    import boto3
    import datetime
    
    my_profile = "prod"
    my_region  = "us-east-1"
    my_repo    = "<repository_name>"  # placeholder: replace with your repository name
    
    date_threshold = (datetime.datetime.now() - datetime.timedelta(days=7)).strftime("%Y-%m-%d")
    print("date threshold = {}".format(date_threshold))
    
    batch = 0
    image = 0
    total = 0
    client = boto3.session.Session(profile_name=my_profile).client("ecr", region_name=my_region)
    nextToken = {}
    while True:
        response = client.describe_images(repositoryName=my_repo, **nextToken)
        batch = batch + 1
        print("Batch {}:".format(batch))
        for entry in response["imageDetails"]:
            total = total + 1
            if entry["imagePushedAt"].strftime("%Y-%m-%d") >= date_threshold:
                image = image + 1
                print("  Image #{}/{}: {} pushed at {}".format(image, total, entry["imageDigest"], entry["imagePushedAt"]))
        if "nextToken" in response:
            nextToken = { "nextToken": response["nextToken"] }
        else:
            break
    
    print("Done - {} batch(es), {} images match out of {}".format(batch, image, total))
    

    If you would like to go in batches of more than the default 100 at a time, you can add the maxResults parameter (up to 1000) to the describe_images() call.
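
As a rough sketch of that variation, the function below passes maxResults on every call and returns the recent images sorted newest first, similar to the sorted() call in the question. The name recent_images_newest_first and its days and page_size arguments are my own choices for illustration, and it again assumes imagePushedAt is a timezone-aware datetime.

```python
import datetime


def recent_images_newest_first(client, repository_name, days=7, page_size=1000):
    """Return images pushed in the last `days` days, newest first.

    `client` is assumed to be a boto3 ECR client; `page_size` is sent as
    maxResults, which the API caps at 1000.
    """
    cutoff = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=days)
    kwargs = {"repositoryName": repository_name, "maxResults": page_size}
    recent = []
    while True:
        response = client.describe_images(**kwargs)
        recent.extend(
            detail for detail in response["imageDetails"]
            if detail["imagePushedAt"] >= cutoff
        )
        token = response.get("nextToken")
        if not token:
            break  # no more batches
        kwargs["nextToken"] = token
    return sorted(recent, key=lambda detail: detail["imagePushedAt"], reverse=True)
```

Larger batches only reduce the number of round trips; the date check still happens on your side.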

    Hope that helps!