Tags: amazon-web-services, amazon-s3, aws-sdk-java

Upload with high-level multipart still gives "No content length specified" warning


Even though I'm using the high-level multipart upload API, I still get this warning in the console:

WARN - com.amazonaws.services.s3.AmazonS3Client - No content length specified for stream data.  Stream contents will be buffered in memory and could result in out of memory errors.

This is how I use the high-level multipart upload, following the example at https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpu-upload-object.html:

      val tm: TransferManager = TransferManagerBuilder
        .standard()
        .withS3Client(s3Client)
        .withMultipartUploadThreshold(5248000)
        .build();

      val metadata = new ObjectMetadata()
      metadata.setContentType(mimeType)
      val request = new PutObjectRequest(bucketName, key, inputStream, metadata)

      val upload = tm.upload(request)
      upload.waitForCompletion()

5248000 bytes is roughly 5 MB, and I tried uploading files much larger than that, so it should have used the multipart strategy, as the withMultipartUploadThreshold docs say:

Sets the size threshold, in bytes, for when to use multipart uploads. Uploads over this size will automatically use a multipart upload strategy, while uploads smaller than this threshold will use a single connection to upload the whole object.

Why does it still give this warning?


Solution

  • The AWS SDK for Java documentation says this about the content length field on ObjectMetadata (setContentLength):

    This field is required when uploading objects to S3, but the Amazon Web Services S3 Java client will automatically set it when working directly with files. When uploading directly from a stream, set this field if possible. Otherwise the client must buffer the entire stream in order to calculate the content length before sending the data to Amazon S3.

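    In other words, you need to call metadata.setContentLength(x) with the size of the file or stream in bytes before constructing the PutObjectRequest. If you don't, the SDK has to buffer the entire stream in RAM to work out its length, which can exhaust memory for larger objects; the warning you're seeing announces exactly that buffering. Without a known length there is also no size to compare against the multipart threshold, which is why setting withMultipartUploadThreshold alone does not make the warning go away.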
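The fix described above could look roughly like the following. This is a minimal sketch against the AWS SDK for Java 1.x, not a definitive implementation: the object name KnownLengthUpload and the helpers metadataFor, streamRequest, fileRequest and upload are hypothetical names introduced for illustration, and it assumes you know the stream's size in bytes up front (for example from the file system or from an HTTP Content-Length header). The file-based variant relies on the documented behaviour that the client sets the length itself when given a File.

```scala
import java.io.{ByteArrayInputStream, File, InputStream}
import java.nio.charset.StandardCharsets

import com.amazonaws.services.s3.AmazonS3
import com.amazonaws.services.s3.model.{ObjectMetadata, PutObjectRequest}
import com.amazonaws.services.s3.transfer.TransferManagerBuilder

// Hypothetical helper object; none of these names come from the SDK.
object KnownLengthUpload {

  // Build metadata that carries the content length as well as the type,
  // so the client does not have to buffer the stream to measure it.
  def metadataFor(mimeType: String, contentLength: Long): ObjectMetadata = {
    val metadata = new ObjectMetadata()
    metadata.setContentType(mimeType)
    metadata.setContentLength(contentLength)
    metadata
  }

  // Stream upload: the caller must supply the exact number of bytes in the stream.
  def streamRequest(bucketName: String, key: String, in: InputStream,
                    mimeType: String, contentLength: Long): PutObjectRequest =
    new PutObjectRequest(bucketName, key, in, metadataFor(mimeType, contentLength))

  // File upload: the SDK reads the length from the file itself.
  def fileRequest(bucketName: String, key: String, file: File): PutObjectRequest =
    new PutObjectRequest(bucketName, key, file)

  // Run the upload through TransferManager and block until it finishes.
  def upload(s3Client: AmazonS3, request: PutObjectRequest): Unit = {
    val tm = TransferManagerBuilder
      .standard()
      .withS3Client(s3Client)
      .withMultipartUploadThreshold(5L * 1024 * 1024) // 5 MiB
      .build()
    try tm.upload(request).waitForCompletion()
    finally tm.shutdownNow(false) // false: leave the caller's S3 client open
  }
}
```

With the original variables, the call would be something like KnownLengthUpload.upload(s3Client, KnownLengthUpload.streamRequest(bucketName, key, inputStream, mimeType, sizeInBytes)), where sizeInBytes is whatever length your source reports. If the data already lives on disk, passing the File instead of a stream avoids the bookkeeping altogether.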