Search code examples
javadebuggingapache-httpclient-4.xapache-httpcomponentsapache-httpclient-5.x

Apache HttpClient is not showing Content-Length and Content-Encoding headers of the response


I installed Apache httpcomponents-client-5.0.x and while reviewing the headers of the http response, I was shocked it doesn't show the Content-Length and Content-Encoding headers, this is the code I used for testing

import org.apache.hc.client5.http.impl.classic.CloseableHttpClient;
import org.apache.hc.client5.http.impl.classic.CloseableHttpResponse;
import org.apache.hc.client5.http.impl.classic.HttpClients;
import com.sun.net.httpserver.Headers;

CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet request = new HttpGet(new URI("https://www.example.com"));
CloseableHttpResponse response = httpclient.execute(request);
Header[] responseHeaders = response.getHeaders();
for(Header header: responseHeaders) {               
    System.out.println(header.getName());
}
// this prints all the headers except 
// status code header
// Content-Length
// Content-Encoding

No matter what I try I get the same result, like this

Iterator<Header> headersItr = response.headerIterator();
while(headersItr.hasNext()) {
    Header header = headersItr.next();
    System.out.println(header.getName());
}

Or this

HttpEntity entity = response.getEntity();
System.out.println(entity.getContentEncoding()); // NULL
System.out.println(entity.getContentLength());   // -1

According to this question that has been asked 6 years ago, it seems like an old issue even with older versions of Apache HttpClient.

Of-course the server is actually returning those headers as confirmed by Wireshark, and Apache HttpClient logs itself

2020-04-03 07:59:09,106 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << HTTP/1.1 200 OK
2020-04-03 07:59:09,106 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Content-Encoding: gzip
2020-04-03 07:59:09,106 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Accept-Ranges: bytes
2020-04-03 07:59:09,107 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Age: 451956
2020-04-03 07:59:09,107 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Cache-Control: max-age=604800
2020-04-03 07:59:09,107 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Content-Type: text/html; charset=UTF-8
2020-04-03 07:59:09,107 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Date: Fri, 03 Apr 2020 05:59:09 GMT
2020-04-03 07:59:09,108 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Etag: "3147526947+gzip"
2020-04-03 07:59:09,108 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Expires: Fri, 10 Apr 2020 05:59:09 GMT
2020-04-03 07:59:09,108 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Last-Modified: Thu, 17 Oct 2019 07:18:26 GMT
2020-04-03 07:59:09,108 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Server: ECS (dcb/7EEB)
2020-04-03 07:59:09,108 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Vary: Accept-Encoding
2020-04-03 07:59:09,109 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << X-Cache: HIT
2020-04-03 07:59:09,109 DEBUG [org.apache.hc.client5.http.headers] http-outgoing-0 << Content-Length: 648

BTW, java.net.http library known as JDK HttpClient works great and show all the headers.

Is there something wrong I did, or should I report a bug that been there for years ?


Solution

  • HttpComponents committer here...

    You did not closely pay attention what Dave G said. By default, HttpClientBuilder will enable transparent decompression and the reason why you don't see some headers anymore is here:

    if (decoderFactory != null) {
      response.setEntity(new DecompressingEntity(response.getEntity(), decoderFactory));
      response.removeHeaders(HttpHeaders.CONTENT_LENGTH);
      response.removeHeaders(HttpHeaders.CONTENT_ENCODING);
      response.removeHeaders(HttpHeaders.CONTENT_MD5);
    } ...
    

    Regarding the JDK HttpClient, it will not perform any transparent decompression, therefore you see the length of the compressed stream. You have to decompress on your own.

    curl committer here...

    I have raised an issue too.

    Update: 03 Feb. '23 The secret codez to disable automatic decompression are:

    CloseableHttpClient httpclient = HttpClients.createSimple();
    // OR
    CloseableHttpClient httpclient = HttpClients.custom().disableContentCompression().build();