I am writing code that downloads a file from a URL and uploads it to S3, but I don't want it to be stored temporarily in a file or in memory. I am downloading through an InputStream, but AWS S3 requires the file size, which I don't have from the InputStream. Is there any other way? I found this discussion on the same topic using Node.js.
My code to fetch the file into an InputStream:
HttpClient client = HttpClient.newBuilder().build();
URI uri = URI.create("{myUrl}");
HttpRequest request = HttpRequest.newBuilder().uri(uri).build();
InputStream is = client.send(request, HttpResponse.BodyHandlers.ofInputStream()).body();
The code I tried to use to upload to S3, but I don't have content_length:
S3Client s3Client = S3Client.builder().build();
PutObjectRequest objectRequest = PutObjectRequest.builder()
.bucket(BUCKET_NAME)
.key(KEY)
.build();
PutObjectResponse por = s3Client.putObject(objectRequest, RequestBody.fromInputStream(is, content_length));
You have a few options.
The easiest is to retain the HttpResponse from your client.send() call, and get the Content-Length header from it. You should also be looking for headers like Content-Type, and storing them as metadata on the S3 object.
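Roughly like this (an untested sketch against the V2 SDK and Java 11's HttpClient; BUCKET_NAME, KEY, and the URL are placeholders):

import java.io.InputStream;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.PutObjectRequest;

public class UrlToS3 {

    static final String BUCKET_NAME = "my-bucket"; // placeholder
    static final String KEY = "my-key";            // placeholder

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder().build();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("{myUrl}"))
                .build();

        // Keep the whole response, not just the body, so the headers stay available
        HttpResponse<InputStream> response =
                client.send(request, HttpResponse.BodyHandlers.ofInputStream());

        // Content-Length may be absent (e.g. chunked responses); fail fast here
        long contentLength = response.headers()
                .firstValueAsLong("Content-Length")
                .orElseThrow(() -> new IllegalStateException("no Content-Length header"));
        String contentType = response.headers()
                .firstValue("Content-Type")
                .orElse("application/octet-stream");

        S3Client s3Client = S3Client.builder().build();
        PutObjectRequest objectRequest = PutObjectRequest.builder()
                .bucket(BUCKET_NAME)
                .key(KEY)
                .contentType(contentType)   // carry the origin's Content-Type across
                .build();

        try (InputStream is = response.body()) {
            s3Client.putObject(objectRequest, RequestBody.fromInputStream(is, contentLength));
        }
    }
}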
That isn't guaranteed to work in all cases: some servers do not provide Content-Length. In that case you need to create a multipart upload to send the file. When doing this, you buffer relatively small chunks (minimum 5 MB) in memory, but can upload up to 10,000 parts. You must either complete or abort the upload, or configure your bucket to delete incomplete uploads after a certain period of time; if not, you'll keep paying for the storage used by the parts you already uploaded.
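A sketch of that approach (a hypothetical helper with minimal error handling; bucket and key come in as parameters): it fills a 5 MB buffer from the stream, uploads each buffer as a part, completes the upload at end of stream, and aborts it if anything fails.

import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import software.amazon.awssdk.core.sync.RequestBody;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.*;

public class MultipartStreamUpload {

    private static final int PART_SIZE = 5 * 1024 * 1024; // 5 MB minimum for every part except the last

    public static void upload(S3Client s3, String bucket, String key, InputStream is) throws IOException {
        String uploadId = s3.createMultipartUpload(
                CreateMultipartUploadRequest.builder().bucket(bucket).key(key).build()
        ).uploadId();

        List<CompletedPart> parts = new ArrayList<>();
        byte[] buffer = new byte[PART_SIZE];
        int partNumber = 1;

        try {
            while (true) {
                // readNBytes blocks until the buffer is full or the stream ends
                int read = is.readNBytes(buffer, 0, buffer.length);
                if (read == 0 && partNumber > 1) {
                    break; // nothing left and at least one part already sent
                }
                UploadPartResponse partResponse = s3.uploadPart(
                        UploadPartRequest.builder()
                                .bucket(bucket).key(key)
                                .uploadId(uploadId)
                                .partNumber(partNumber)
                                .build(),
                        RequestBody.fromBytes(read == buffer.length ? buffer : Arrays.copyOf(buffer, read)));
                parts.add(CompletedPart.builder().partNumber(partNumber).eTag(partResponse.eTag()).build());
                partNumber++;
                if (read < buffer.length) {
                    break; // short read means the stream is exhausted
                }
            }
            s3.completeMultipartUpload(CompleteMultipartUploadRequest.builder()
                    .bucket(bucket).key(key)
                    .uploadId(uploadId)
                    .multipartUpload(CompletedMultipartUpload.builder().parts(parts).build())
                    .build());
        } catch (RuntimeException | IOException e) {
            // Abort so the already-uploaded parts don't keep accruing storage charges
            s3.abortMultipartUpload(AbortMultipartUploadRequest.builder()
                    .bucket(bucket).key(key).uploadId(uploadId).build());
            throw e;
        }
    }
}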
A third alternative is to use the V1 SDK, which gives you TransferManager. That handles the multipart upload for you, and uses multiple threads to improve throughput. But it still hasn't been implemented for V2.
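For reference, the V1 route looks something like this (again just a sketch; note that if you can't set a content length on the ObjectMetadata, the V1 SDK buffers the stream in memory to size the parts, so it works best when you do have the length):

import java.io.InputStream;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
import com.amazonaws.services.s3.transfer.Upload;

public class V1TransferManagerUpload {
    public static void upload(String bucket, String key, InputStream is, long contentLength)
            throws InterruptedException {
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        TransferManager tm = TransferManagerBuilder.standard().withS3Client(s3).build();
        try {
            ObjectMetadata metadata = new ObjectMetadata();
            // Without a known length the V1 SDK buffers the whole stream in memory,
            // so set it whenever the server gave you Content-Length
            metadata.setContentLength(contentLength);

            Upload upload = tm.upload(bucket, key, is, metadata);
            upload.waitForCompletion(); // blocks until the (possibly multipart) upload finishes
        } finally {
            tm.shutdownNow(false); // false: keep the underlying S3 client alive for reuse
        }
    }
}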