Tags: kotlin, amazon-s3, file-upload, aws-java-sdk-2.x, directory-upload

AWS Java SDK v2: Upload a directory to S3


I would like to upload a directory to S3 using the AWS Java SDK v2.

For example, how would I implement the following function?

fun uploadDirectory(bucket: String, prefix: String, directory: Path)

I would like the contents of directory to be replicated at s3://bucket/prefix/ on S3.

The v2 SDK documentation has an example for uploading a single object, but there doesn't seem to be an equivalent of the "Upload a Directory" example from v1.


Solution

  • You can implement it by using the following strategy:

    1. Use Files.walk to walk the directory, identifying all of the files.
    2. Asynchronously upload the files using the SDK, via S3AsyncClient.putObject.
    3. Use CompletableFuture.allOf to combine all of the upload tasks, and wait for completion.

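    As far as I can tell, concurrency here is bounded by the async client's default HTTP settings (the Netty-based client allows up to 50 concurrent connections by default) rather than by a pool of 50 threads. This works fine for me with directories that contain thousands of files.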
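
If you would rather cap the number of in-flight uploads yourself instead of relying on the client's defaults, one option is a semaphore around the async calls. This is only a sketch: runBounded is a hypothetical helper (not part of the SDK), and the start parameter stands in for a call such as s3AsyncClient.putObject(request, path).

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.Semaphore

// Hypothetical helper: starts one async task per item, but never has more
// than `maxInFlight` tasks running at the same time.
fun <T, R> runBounded(
    items: List<T>,
    maxInFlight: Int,
    start: (T) -> CompletableFuture<R>
): List<R> {
    val permits = Semaphore(maxInFlight)
    val futures = items.map { item ->
        permits.acquire() // blocks the calling thread until a slot is free
        val future = try {
            start(item)
        } catch (e: Throwable) {
            permits.release() // the task never started, so give the slot back
            throw e
        }
        // Free the slot whether the task succeeded or failed.
        future.whenComplete { _, _ -> permits.release() }
    }
    // join() throws a CompletionException if any task failed.
    CompletableFuture.allOf(*futures.toTypedArray()).join()
    return futures.map { it.join() }
}
```

Note that acquire() blocks the thread that submits the work, which is usually acceptable for a one-off batch upload but not for code running on an event loop.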

    Here, s3Prefix is prepended to the key of every uploaded object, so it plays the role of the target directory in the bucket.
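
To make the prefix-to-key mapping concrete, here is a minimal sketch that only computes the keys and never touches S3. plannedKeys is a hypothetical helper, and trimming slashes from the prefix is my own assumption about the desired behavior (it avoids a leading or doubled "/" in the key).

```kotlin
import java.nio.file.Files
import java.nio.file.Path
import kotlin.io.path.invariantSeparatorsPathString
import kotlin.io.path.isRegularFile
import kotlin.streams.asSequence

// Hypothetical helper: maps every regular file under `directory` to the
// S3 key it would be uploaded to.
fun plannedKeys(s3Prefix: String, directory: Path): Map<Path, String> =
    Files.walk(directory).use { stream ->
        stream.asSequence()
            .filter { it.isRegularFile() }
            .associateWith { path ->
                // Always use "/" in keys, even where the platform separator is "\".
                val relative = directory.relativize(path).invariantSeparatorsPathString
                // Drop empty parts so an empty prefix does not produce a leading "/".
                listOf(s3Prefix.trim('/'), relative)
                    .filter { it.isNotEmpty() }
                    .joinToString("/")
            }
    }
```

For a directory containing a.txt and sub/b.txt, a prefix of "backup" would give the keys backup/a.txt and backup/sub/b.txt.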

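    import java.nio.file.Files
    import java.nio.file.Path
    import java.util.concurrent.CompletableFuture
    import kotlin.io.path.invariantSeparatorsPathString
    import kotlin.io.path.isDirectory
    import kotlin.io.path.isRegularFile
    import kotlin.streams.asSequence
    import software.amazon.awssdk.services.s3.S3AsyncClient
    import software.amazon.awssdk.services.s3.model.PutObjectRequest
    import software.amazon.awssdk.services.s3.model.PutObjectResponse

    // Assumption: a client built from the default region and credentials chain.
    private val s3AsyncClient: S3AsyncClient = S3AsyncClient.create()

    fun uploadDirectory(s3Bucket: String, s3Prefix: String, directory: Path) {
        require(directory.isDirectory()) { "$directory is not a directory" }

        // Start one upload per regular file; the stream is fully consumed inside use {}.
        val uploads = Files.walk(directory).use { stream ->
            stream.asSequence()
                .filter { it.isRegularFile() }
                .map { path ->
                    // S3 keys use "/", so avoid the platform separator ("\" on Windows).
                    val relativeKey = directory.relativize(path).invariantSeparatorsPathString
                    putObject(s3Bucket = s3Bucket, s3Key = "$s3Prefix/$relativeKey", path = path)
                }
                .toList()
        }

        // Block until every upload has finished; join() throws if any of them failed.
        CompletableFuture.allOf(*uploads.toTypedArray()).join()
    }

    private fun putObject(s3Bucket: String, s3Key: String, path: Path)
        : CompletableFuture<PutObjectResponse> {
        val request = PutObjectRequest.builder()
            .bucket(s3Bucket)
            .key(s3Key)
            .build()

        // The Path overload streams the file contents as the request body.
        return s3AsyncClient.putObject(request, path)
    }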
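
One caveat with joining on CompletableFuture.allOf is that a single failed upload surfaces as one exception, without telling you which files failed. If that matters, a possible approach is to keep the futures keyed by S3 key and inspect them individually. The sketch below assumes you collect such a map yourself; failedUploads is a hypothetical helper, not an SDK function.

```kotlin
import java.util.concurrent.CompletableFuture

// Hypothetical helper: waits for every upload, then returns the failures by key.
fun failedUploads(uploads: Map<String, CompletableFuture<*>>): Map<String, Throwable> {
    val all = CompletableFuture.allOf(*uploads.values.toTypedArray())
    // Swallow the aggregate failure; each future is inspected individually below.
    all.exceptionally { null }.join()
    return uploads
        .filterValues { it.isCompletedExceptionally }
        .mapValues { (_, future) ->
            val thrown = runCatching { future.join() }.exceptionOrNull()
            // join() wraps the original error in a CompletionException; unwrap it when possible.
            thrown?.cause ?: thrown ?: IllegalStateException("no exception recorded")
        }
}
```

An empty result means every upload completed normally; otherwise the keys in the map are the ones worth retrying or reporting.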