I am a bit new to the Apache HC API. I am trying to download a huge file (10 GB) from a server in a cloud environment and then upload it to Amazon S3. Because the file is so big, the server sends it with chunked transfer encoding and in gzip format. The cloud environment has neither enough disk space to store the file temporarily nor enough memory to hold it.
Mainly, I have two interfaces:

interface ResourceDownloader {
    InputStream download(AbstractChannel channel);
}

interface ResourceUploader {
    void upload(AbstractChannel channel, InputStream inputStream);
}
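
For context, the intended wiring is to hand the downloader's stream straight to the uploader, so the full file never exists on disk or in memory (a minimal illustrative sketch; the transfer method is just glue I wrote for this question):

// Illustrative glue: the uploader consumes the stream as the downloader produces it,
// so no complete copy of the file is ever materialized.
void transfer(ResourceDownloader downloader, ResourceUploader uploader, AbstractChannel channel) {
    uploader.upload(channel, downloader.download(channel));
}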
Part1:
While using the Apache HttpClient library, I see the HTTP response come back with the following structure:

ResponseEntityProxy contains >> { BasicHttpEntity [WrappedEntity] - content as ChunkedInputStream }

Does this mean the whole 10 GB will be sitting in an in-memory byte buffer on the client side once the client.execute(getMethod) call completes? Or are chunks fetched from the server only when I invoke read, as below? (In the real case no disk would be available; the snippet below is just for demonstration.)
try (InputStream in = inputStream;
     OutputStream fos = outputStream) {
    if (in instanceof GZIPInputStream) {
        byte[] buffer = new byte[1024];
        int len;
        while ((len = in.read(buffer)) != -1) {
            fos.write(buffer, 0, len);
        }
    }
    // try-with-resources closes both streams even if a read or write fails
} catch (IOException e) {
    logger.error("Exception occurred while processing file on disk", e);
}
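
For context, here is roughly how I obtain that stream in the first place (a simplified sketch; the URL is a placeholder and the client setup is just the default one):

import java.io.InputStream;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

void streamHugeFile() throws Exception {
    try (CloseableHttpClient client = HttpClients.createDefault();
         CloseableHttpResponse response = client.execute(new HttpGet("https://example.com/huge-file"))) {
        // My assumption: getContent() hands back a stream tied to the live connection,
        // so each read() pulls further chunks off the socket on demand.
        try (InputStream in = response.getEntity().getContent()) {
            byte[] buffer = new byte[8192];
            int len;
            while ((len = in.read(buffer)) != -1) {
                // process each chunk here instead of accumulating the whole file
            }
        }
    }
}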
Part2:
I know how to do a multipart upload when I have the content length or the full file available, but how should I upload a chunked input stream of unknown length to Amazon S3?
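
For reference, this is what works for me today when the content length is known up front (a sketch with the AWS SDK for Java v1; s3, bucket, and key are placeholders for my actual values):

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.PutObjectRequest;

import java.io.InputStream;

// Works only when the exact length is known in advance;
// without it the SDK would have to buffer the stream itself.
void uploadKnownLength(AmazonS3 s3, String bucket, String key, InputStream in, long knownLength) {
    ObjectMetadata metadata = new ObjectMetadata();
    metadata.setContentLength(knownLength);
    s3.putObject(new PutObjectRequest(bucket, key, in, metadata));
}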
Thanks, Dharam
HttpClient always streams request and response entities unless specifically instructed to do otherwise.
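
For Part2, one workable approach (a sketch using the AWS SDK for Java v1 low-level multipart API; the class name and the fill helper are mine, not part of the SDK) is to read the chunked stream into fixed 5 MB buffers and upload each buffer as one part, so at most one part ever sits in memory:

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.AbortMultipartUploadRequest;
import com.amazonaws.services.s3.model.CompleteMultipartUploadRequest;
import com.amazonaws.services.s3.model.InitiateMultipartUploadRequest;
import com.amazonaws.services.s3.model.InitiateMultipartUploadResult;
import com.amazonaws.services.s3.model.PartETag;
import com.amazonaws.services.s3.model.UploadPartRequest;
import com.amazonaws.services.s3.model.UploadPartResult;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class StreamingS3Uploader {

    private static final int PART_SIZE = 5 * 1024 * 1024; // S3's minimum part size (except for the last part)

    public static void upload(AmazonS3 s3, String bucket, String key, InputStream in) throws IOException {
        InitiateMultipartUploadResult init =
                s3.initiateMultipartUpload(new InitiateMultipartUploadRequest(bucket, key));
        List<PartETag> partETags = new ArrayList<>();
        byte[] buf = new byte[PART_SIZE];
        int partNumber = 1;
        try {
            int filled;
            // Fill one 5 MB buffer from the chunked stream, then ship it as a single part
            while ((filled = fill(in, buf)) > 0) {
                UploadPartResult result = s3.uploadPart(new UploadPartRequest()
                        .withBucketName(bucket)
                        .withKey(key)
                        .withUploadId(init.getUploadId())
                        .withPartNumber(partNumber++)
                        .withInputStream(new ByteArrayInputStream(buf, 0, filled))
                        .withPartSize(filled));
                partETags.add(result.getPartETag());
            }
            s3.completeMultipartUpload(
                    new CompleteMultipartUploadRequest(bucket, key, init.getUploadId(), partETags));
        } catch (RuntimeException | IOException e) {
            // Abort so S3 does not keep storing (and charging for) orphaned parts
            s3.abortMultipartUpload(new AbortMultipartUploadRequest(bucket, key, init.getUploadId()));
            throw e;
        }
    }

    // Reads until the buffer is full or the stream ends; returns the number of bytes read
    private static int fill(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        int n;
        while (off < buf.length && (n = in.read(buf, off, buf.length - off)) != -1) {
            off += n;
        }
        return off;
    }
}

S3 requires every part except the last to be at least 5 MB, which is why the buffer is sized that way; the last part may be smaller, and the upload is aborted on failure so no incomplete parts are left behind.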