Tags: java, amazon-web-services, amazon-s3, substring

Find objects in S3 based on a substring


I have an S3 bucket that contains objects named in the following format:

  • 12345-abcde.json
  • 56789-abcde.json

If I wanted to retrieve those objects from S3 based on the substring abcde, how could I do it?

I had a look at the Java SDK documentation and there seems to be no method available to filter based on a substring.

I want to avoid retrieving all the objects from S3 and then filtering them with my own logic, if possible.

Thanks


Solution

  • There is currently no SDK method that does this. S3's list operation can filter server-side only by key prefix, not by an arbitrary substring.

    You can, however, do it with the AWS CLI:

    aws s3 cp s3://my-bucket/  /my-dir/ --recursive --exclude "*" --include "*abcde*.json"
    

    If you want to automate this for a large number of objects, you can invoke the command from a script.

    For example, in Python the command above can be run like this:

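    import subprocess
    
    # Pass the command as an argument list so no shell quoting is needed.
    aws_cli_command = [
        "aws", "s3", "cp", "s3://your-bucket-name/", "local-directory/",
        "--recursive", "--exclude", "*", "--include", "*xyz*",
    ]
    
    try:
        # check=True raises CalledProcessError if the CLI exits with a non-zero status.
        subprocess.run(aws_cli_command, check=True)
        print("Download completed successfully.")
    except subprocess.CalledProcessError as e:
        print(f"Error: {e}")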
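
If shelling out to the CLI is not an option, a similar result can be had from an SDK by listing the keys and matching the substring on the client. The listing returns only key names and metadata, so object bodies are not downloaded until you ask for them, although every key under the prefix is still enumerated. Below is a minimal sketch, assuming boto3 is installed and credentials are configured; the function name `iter_matching_keys`, the bucket name, and the optional `client` parameter are illustrative choices, not part of the original answer.

```python
def iter_matching_keys(bucket, substring, prefix="", client=None):
    """Yield every key in `bucket` (under `prefix`) whose name contains `substring`."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        import boto3
        client = boto3.client("s3")

    # list_objects_v2 returns at most 1000 keys per call; the paginator
    # follows the continuation tokens for us.
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page that holds no objects.
        for obj in page.get("Contents", []):
            if substring in obj["Key"]:
                yield obj["Key"]


if __name__ == "__main__":
    # Hypothetical bucket name; replace with your own.
    for key in iter_matching_keys("my-bucket", "abcde"):
        print(key)
```

If the keys share a known leading part, passing it as `prefix` narrows the listing on the server side before the substring check runs.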