java, amazon-web-services, amazon-s3

Listing files in a specific "folder" of an AWS S3 bucket


I need to list all files in a certain folder of my S3 bucket.

The folder structure is the following:

/my-bucket/users/<user-id>/contacts/<contact-id>

I have files related to users and files related to a certain user's contact. I need to list both.

To list files I'm using this code:

ListObjectsRequest listObjectsRequest = new ListObjectsRequest().withBucketName("my-bucket")
                .withPrefix("some-prefix").withDelimiter("/");
ObjectListing objects = transferManager.getAmazonS3Client().listObjects(listObjectsRequest);

To list a certain user's files I'm using this prefix:

users/<user-id>/

and I'm correctly getting all files in the directory, excluding the contacts subdirectory, for example:

users/<user-id>/file1.txt
users/<user-id>/file2.txt
users/<user-id>/file3.txt

To list the files of a certain contact of that user, I'm instead using this prefix:

users/<user-id>/contacts/<contact-id>/

but in this case I'm also getting the directory itself as a returned object:

users/<user-id>/contacts/<contact-id>/file1.txt
users/<user-id>/contacts/<contact-id>/file2.txt
users/<user-id>/contacts/<contact-id>/

Why am I getting this behaviour? What's different between the two listing requests? I need to list only the files in the directory, excluding sub-directories.


Solution

  • Everything in S3 is an object. To you, it may be files and folders. But to S3, they're just objects.

    Objects whose keys end with the delimiter (/ in most cases) are usually perceived as folders, but that's not always the case. It depends on the application. In your case, you're interpreting such an object as a folder. S3 is not. To S3 it's just another object.

    In your case above, the object users/<user-id>/contacts/<contact-id>/ exists in S3 as a distinct object, but the object users/<user-id>/ does not. That's the difference between your two responses. We can't tell you why that is, but someone created the object in one case and didn't in the other. You don't see it in the AWS Management Console because the console interprets it as a folder and hides it from you.

    Since S3 just sees these things as objects, it won't "exclude" certain things for you. It's up to the client to deal with the objects as they should be dealt with.

    Your Solution

    Since you're the one who doesn't want the folder objects, you can exclude them yourself by checking whether the last character of each key is a /. If it is, skip that object in the response.
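
As a minimal sketch of that check, here is the filtering applied to plain key strings. The class name S3KeyFilter, the method name withoutFolderPlaceholders, and the sample keys are made up for illustration. With the SDK code from the question, the keys would presumably come from calling getKey() on each entry of objects.getObjectSummaries().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class S3KeyFilter {

    /**
     * Returns the keys that do NOT end with the delimiter, i.e. it drops
     * zero-or-more "folder" placeholder objects such as "users/u1/contacts/c1/".
     * Hypothetical helper; the delimiter is assumed to be non-empty.
     */
    public static List<String> withoutFolderPlaceholders(List<String> keys, String delimiter) {
        List<String> files = new ArrayList<>();
        for (String key : keys) {
            // A key ending in the delimiter is what the console shows as a folder.
            if (!key.endsWith(delimiter)) {
                files.add(key);
            }
        }
        return files;
    }

    public static void main(String[] args) {
        // Sample keys shaped like the listing in the question (IDs are placeholders).
        List<String> keys = Arrays.asList(
                "users/u1/contacts/c1/file1.txt",
                "users/u1/contacts/c1/file2.txt",
                "users/u1/contacts/c1/");

        for (String key : withoutFolderPlaceholders(keys, "/")) {
            System.out.println(key);
        }
    }
}
```

Filtering on the key suffix works the same whether or not the placeholder object happens to exist, so both of the listing requests in the question can go through the same code path.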