I am using the boto3 API, but I'm open to using the CLI if it gives more flexibility.
client = boto3.session.Session(profile_name="prod").client("ecr", region_name="us-east-1")
response = client.describe_images(repositoryName=repository_name)
What I used to do is run the above and then sort by date using sorted(response["imageDetails"], key=lambda x: x["imagePushedAt"]). However, I am only getting 90 results, and I'm wondering if this is an internal limit.
I see that there is a filter argument to describe_images, and I tried using it with the code below, but I get this error:
*** botocore.exceptions.ParamValidationError: Parameter validation failed:
Unknown parameter in filter: "Name", must be one of: tagStatus
Unknown parameter in filter: "Values", must be one of: tagStatus
import datetime
date_filter = (datetime.datetime.now() - datetime.timedelta(days=7)).strftime("%Y-%m-%d")
filter={"Name": "imagePushedAt", "Values": [date_filter+"*"]}
response = client.describe_images(repositoryName=repository_name, filter=filter)
Any thoughts on how to filter ECR images?
When you look at the boto3 documentation for [ecr.describe_images](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/ecr.html#ECR.Client.describe_images), you will see a few things:
- nextToken parameter
- maxResults parameter
The nextToken is used to iterate and get additional batches of results. The function is limited to 100 results per batch by default, though you can push that up to 1000 if you want. If (or when) you have more than 1000 images in your repository you can't avoid using the pagination support, alas.
You can use the client.get_paginator() approach, if you prefer. Below, I'm building on top of what you already started with, though.
The other thing you'll note in the documentation is the filter parameter, which is indeed fairly limited, as you noticed.
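To make that concrete: as your error message says, the only key filter accepts is tagStatus, so the date cut-off has to be applied on your side. Here is a minimal sketch of that step. The helper name pushed_since is made up for illustration, and it assumes imagePushedAt comes back as a timezone-aware datetime, which is what boto3 normally returns.

```python
import datetime


def pushed_since(image_details, days, now=None):
    """Return the entries pushed within the last `days` days, newest first.

    Hypothetical helper. Assumes each entry's "imagePushedAt" is a
    timezone-aware datetime, so it is compared against an aware cut-off.
    """
    now = now or datetime.datetime.now(datetime.timezone.utc)
    cutoff = now - datetime.timedelta(days=days)
    recent = [d for d in image_details if d["imagePushedAt"] >= cutoff]
    return sorted(recent, key=lambda d: d["imagePushedAt"], reverse=True)


# The server-side filter only understands tagStatus, for example:
#   response = client.describe_images(repositoryName=repository_name,
#                                     filter={"tagStatus": "TAGGED"})
# The date filtering then happens on the returned list:
#   recent = pushed_since(response["imageDetails"], days=7)
```

Comparing datetimes directly avoids formatting both sides as strings, but it only works if both sides are timezone-aware, hence the explicit UTC "now".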
You mentioned finding 90 images, which suggests that you had 10 images from before the last week, in those first 100 images.
Here's one way to iterate over all images, checking the imagePushedAt of each of them:
#!/usr/bin/env python3
import boto3
import datetime

my_profile = "prod"
my_region = "us-east-1"
my_repo = "<repository_name>"  # replace with your repository name

# Dates are compared as "YYYY-MM-DD" strings, which sort chronologically.
date_threshold = (datetime.datetime.now() - datetime.timedelta(days=7)).strftime("%Y-%m-%d")
print("date threshold = {}".format(date_threshold))

batch = 0  # number of describe_images calls made
image = 0  # images pushed on or after the threshold
total = 0  # all images seen

client = boto3.session.Session(profile_name=my_profile).client("ecr", region_name=my_region)

nextToken = {}  # empty on the first call, so no nextToken argument is sent
while True:
    response = client.describe_images(repositoryName=my_repo, **nextToken)
    batch = batch + 1
    print("Batch {}:".format(batch))
    for entry in response["imageDetails"]:
        total = total + 1
        if entry["imagePushedAt"].strftime("%Y-%m-%d") >= date_threshold:
            image = image + 1
            print(" Image #{}/{}: {} pushed at {}".format(image, total, entry["imageDigest"], entry["imagePushedAt"]))
    # Keep going for as long as the response carries a nextToken.
    if "nextToken" in response:
        nextToken = {"nextToken": response["nextToken"]}
    else:
        break

print("Done - {} batch(es), {} images match out of {}".format(batch, image, total))
If you would like to go in batches of more than the default 100 at a time, you can add the maxResults parameter to the describe_images() call.
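If you would rather let boto3 do the nextToken bookkeeping, the get_paginator() route mentioned earlier could look roughly like this. It's only a sketch: iter_image_details is a name I made up, and it takes the client as an argument so it works with whatever session and profile you already have. The PageSize setting is what ends up as maxResults on each underlying call.

```python
def iter_image_details(client, repository_name, page_size=1000):
    """Yield every imageDetails entry in the repository, page by page.

    Hypothetical helper: `client` is expected to be a boto3 ECR client,
    e.g. boto3.session.Session(profile_name="prod").client("ecr", ...).
    """
    paginator = client.get_paginator("describe_images")
    pages = paginator.paginate(
        repositoryName=repository_name,
        PaginationConfig={"PageSize": page_size},  # sent as maxResults per call
    )
    for page in pages:
        yield from page["imageDetails"]


# Usage, with the client and repository name from the script above:
#   for entry in iter_image_details(client, my_repo):
#       print(entry["imageDigest"], entry["imagePushedAt"])
```

Because it is a generator, you can filter on imagePushedAt as the entries stream in, without holding the whole repository listing in memory.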
Hope that helps!