Tags: amazon-web-services, go, amazon-s3, aws-sdk-go

ListObjects operation's limit on AWS


I am going through the documentation of the ListObjects function in the AWS Go SDK.

(the same holds more or less for the actual API endpoint)

The docs say:

Returns some or all (up to 1,000) of the objects in a bucket.

What does this mean? If my bucket has 200,000 objects, will this API call not work?

This example uses ListObjectsPages (which calls ListObjects under the hood) and claims to list all objects.

What is the actual case here?


Solution

    "I am going through the documentation of ListObjects function in AWS' go SDK."

    Use ListObjectsV2. It behaves more or less the same, but it is the updated version of ListObjects. AWS rarely revises an API, and when it does, it's usually for a good reason. AWS is also careful about backwards compatibility, which is why ListObjects still exists.

    "This example uses ListObjectsPages (which calls ListObjects under the hood) and claims to list all objects."

    ListObjectsPages is the paginated equivalent of ListObjects. The same relationship holds between ListObjectsV2Pages and ListObjectsV2, which I describe below.

    Many AWS API responses are paginated. AWS uses cursor pagination: when more results exist, the response carries a cursor that identifies where the next request should resume. For ListObjectsV2, a response with IsTruncated set to true also includes a NextContinuationToken. A subsequent ListObjectsV2 request passes that value as its ContinuationToken to continue the listing where the previous response left off.
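The cursor loop described above can be sketched without the SDK. This is a minimal simulation, not the SDK's implementation: `fakeBucket`, `listInput`, `listOutput`, `listAll` and `makeBucket` are hypothetical names, and the token is simply an offset encoded as a string (real S3 tokens are opaque). Only the field names ContinuationToken, NextContinuationToken, IsTruncated and MaxKeys are taken from ListObjectsV2.

```go
package main

import (
	"fmt"
	"strconv"
)

// listInput and listOutput are hypothetical stand-ins that mirror a few
// fields of the ListObjectsV2 request and response. They are not SDK types.
type listInput struct {
	ContinuationToken string // empty on the first request
	MaxKeys           int    // page size; S3 caps this at 1,000
}

type listOutput struct {
	Contents              []string
	IsTruncated           bool
	NextContinuationToken string
}

// fakeBucket simulates a bucket holding the given keys.
type fakeBucket struct{ keys []string }

// listObjectsV2 returns at most MaxKeys keys, starting at the offset that
// this simulation encodes in the token.
func (b fakeBucket) listObjectsV2(in listInput) listOutput {
	start := 0
	if in.ContinuationToken != "" {
		start, _ = strconv.Atoi(in.ContinuationToken)
	}
	end := start + in.MaxKeys
	if end > len(b.keys) {
		end = len(b.keys)
	}
	out := listOutput{Contents: b.keys[start:end]}
	if end < len(b.keys) {
		out.IsTruncated = true
		out.NextContinuationToken = strconv.Itoa(end)
	}
	return out
}

// listAll follows the cursor until IsTruncated is false. It returns every
// key plus the number of requests it took.
func listAll(b fakeBucket, maxKeys int) ([]string, int) {
	var all []string
	requests := 0
	in := listInput{MaxKeys: maxKeys}
	for {
		out := b.listObjectsV2(in)
		requests++
		all = append(all, out.Contents...)
		if !out.IsTruncated {
			return all, requests
		}
		in.ContinuationToken = out.NextContinuationToken
	}
}

// makeBucket builds a simulated bucket with n generated keys.
func makeBucket(n int) fakeBucket {
	keys := make([]string, n)
	for i := range keys {
		keys[i] = fmt.Sprintf("object-%06d", i)
	}
	return fakeBucket{keys: keys}
}

func main() {
	keys, requests := listAll(makeBucket(2500), 1000)
	fmt.Println(len(keys), requests) // prints "2500 3"
}
```

With 2,500 keys and a page size of 1,000, the loop issues three requests: two truncated pages and one final page.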

    ListObjectsV2Pages handles the iterative ListObjectsV2 requests for you so you don't have to handle the logic of ContinuationToken and IsTruncated. Instead, you provide a function that will be invoked for each "page" in the response.

    So it's accurate to say ListObjectsV2Pages will list "all" the objects, but only because it makes multiple ListObjectsV2 calls under the hood, each returning at most 1,000 objects. A bucket with 200,000 objects therefore works fine; it just takes many requests.
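To illustrate the callback style, here is a small sketch of what a ...Pages helper does. In aws-sdk-go v1, ListObjectsV2Pages takes a callback of the form `func(page *s3.ListObjectsV2Output, lastPage bool) bool`, where returning false stops the iteration. The code below imitates that shape with hypothetical stand-ins (`page`, `fetchFunc`, `listPages`, `threePages`, `collect`) and a canned three-page backend, so it runs without the SDK or AWS credentials.

```go
package main

import "fmt"

// page is a hypothetical stand-in for one ListObjectsV2 response.
type page struct {
	Contents              []string
	IsTruncated           bool
	NextContinuationToken string
}

// fetchFunc performs one list request for the given token ("" for the first).
type fetchFunc func(token string) page

// listPages calls fetch repeatedly and hands each page to fn. As with the
// SDK's ...Pages functions, fn receives a lastPage flag and can return
// false to stop early.
func listPages(fetch fetchFunc, fn func(p page, lastPage bool) bool) {
	token := ""
	for {
		p := fetch(token)
		lastPage := !p.IsTruncated
		if !fn(p, lastPage) || lastPage {
			return
		}
		token = p.NextContinuationToken
	}
}

// threePages is a canned backend that serves exactly three pages.
func threePages(token string) page {
	switch token {
	case "":
		return page{Contents: []string{"a", "b"}, IsTruncated: true, NextContinuationToken: "t1"}
	case "t1":
		return page{Contents: []string{"c", "d"}, IsTruncated: true, NextContinuationToken: "t2"}
	default:
		return page{Contents: []string{"e"}}
	}
}

// collect gathers keys page by page, stopping after maxPages pages
// (0 means no limit).
func collect(maxPages int) []string {
	var keys []string
	seen := 0
	listPages(threePages, func(p page, lastPage bool) bool {
		keys = append(keys, p.Contents...)
		seen++
		return maxPages == 0 || seen < maxPages
	})
	return keys
}

func main() {
	fmt.Println(collect(0)) // prints "[a b c d e]"
	fmt.Println(collect(1)) // prints "[a b]"
}
```

The caller never touches a token; it only decides, per page, whether to keep going.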

    Thus, the ...Pages functions can be considered convenience functions. Use them whenever they fit: they take away the pain of pagination, and pagination is what keeps potentially high-volume API responses manageable. In AWS, if pagination is supported, assume you need it. In some cases, the first page is not guaranteed to contain any results, even if subsequent pages do.
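The last point has a practical consequence: a hand-written loop should stop on IsTruncated, not on an empty page. The sketch below uses a contrived backend (the hypothetical `fetch`) whose first page is empty but truncated; `naiveList` and `correctList` are likewise illustrative names, not SDK functions.

```go
package main

import "fmt"

// page is a hypothetical stand-in for one paginated list response.
type page struct {
	Contents              []string
	IsTruncated           bool
	NextContinuationToken string
}

// fetch is a contrived backend: the first page is empty but truncated,
// and the second page holds the actual keys.
func fetch(token string) page {
	if token == "" {
		return page{IsTruncated: true, NextContinuationToken: "next"}
	}
	return page{Contents: []string{"logs/a", "logs/b"}}
}

// naiveList wrongly treats an empty page as the end of the listing.
func naiveList() []string {
	var keys []string
	token := ""
	for {
		p := fetch(token)
		if len(p.Contents) == 0 {
			return keys // bug: stops even though IsTruncated is true
		}
		keys = append(keys, p.Contents...)
		if !p.IsTruncated {
			return keys
		}
		token = p.NextContinuationToken
	}
}

// correctList keeps requesting pages until IsTruncated is false,
// regardless of how many keys each page contains.
func correctList() []string {
	var keys []string
	token := ""
	for {
		p := fetch(token)
		keys = append(keys, p.Contents...)
		if !p.IsTruncated {
			return keys
		}
		token = p.NextContinuationToken
	}
}

func main() {
	fmt.Println(len(naiveList()), len(correctList())) // prints "0 2"
}
```

The ...Pages helpers follow the cursor for you, which is one more reason to prefer them over a hand-rolled loop.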