Why does urlConnection.getContentType() return null for some images read from a URL?


I am working with Java 7 and trying to read the MIME type from a URL using the code below. In most cases urlConnection.getContentType() returns the content type, but for some URLs it returns null.

For example, in the code below I can read the MIME type for url2, but url1 gives null.

import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import javax.net.ssl.HttpsURLConnection;

class readMimeType{

    public static void main(String args[]) {
        String url1 = "https://akumyndigitalcontent.blob.core.windows.net/visitattachments/1804915_0_2_87_.jpeg";
        String url2 = "https://gigwalk-multitenant-api-server.s3.amazonaws.com/public_uploads/62ae090584074fefeeada538c5ceb206fedf58f9e9a2aef463908fb53793bd64a28ed152427f96eb923cb789e947a6984db1c3460fcf373fb589b9e3051f6ef8/9a71308d-3da2-4e96-88b9-cc75a7470db3";

        try {
            URL serverUrl = new URL(url1);
            URLConnection urlConnection = serverUrl.openConnection();
            HttpsURLConnection httpConnection = (HttpsURLConnection) urlConnection;
            httpConnection.setInstanceFollowRedirects(false);
            httpConnection.setDoOutput(true);

            InputStream initialStream = httpConnection.getInputStream();

            String mimeType = urlConnection.getContentType();

            System.out.println("mimeType::::" + mimeType);
        } catch (Exception exception) {
            // Print the failure instead of silently swallowing it
            exception.printStackTrace();
        }
    }
}

Solution

  • The URLConnection#getContentType documentation says:

    Returns the value of the content-type header field.

    So if the response does not include a Content-Type header, the method returns null.

    Use curl to check:

    curl -I https://akumyndigitalcontent.blob.core.windows.net/visitattachments/1804915_0_2_87_.jpeg
    
    HTTP/1.1 200 OK
    Cache-Control: public, max-age=31622400
    Content-Length: 2794649
    Last-Modified: Sun, 02 Jun 2019 00:25:00 GMT
    ETag: 0x8D6E6F0BFBA22BC
    Vary: Origin
    Server: Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0
    x-ms-request-id: f5962c19-901e-0083-78f8-20a0ca000000
    x-ms-version: 2009-09-19
    x-ms-lease-status: unlocked
    x-ms-blob-type: BlockBlob
    Date: Wed, 12 Jun 2019 08:24:58 GMT
    
    curl -I https://gigwalk-multitenant-api-server.s3.amazonaws.com/public_uploads/62ae090584074fefeeada538c5ceb206fedf58f9e9a2aef463908fb53793bd64a28ed152427f96eb923cb789e947a6984db1c3460fcf373fb589b9e3051f6ef8/9a71308d-3da2-4e96-88b9-cc75a7470db3
    
    HTTP/1.1 200 OK
    x-amz-id-2: LXyjyXfMWNmwYfkUhiGnbyJBE4WovVwUTNi7ELXmDYpLtwGHVl1BfBPYgxgDazK44sIIwXFMv+4=
    x-amz-request-id: FF7CE75150E28EB3
    Date: Wed, 12 Jun 2019 08:25:15 GMT
    Last-Modified: Thu, 11 Oct 2018 02:15:15 GMT
    ETag: "15ad210d28be6a37af2c0e37a5c30e6b"
    x-amz-storage-class: STANDARD_IA
    Accept-Ranges: bytes
    Content-Type: image/jpeg
    Content-Length: 200785
    Server: AmazonS3
    

    As you can see, only the second response (from S3) includes a Content-Type header; the Azure Blob response for url1 has none, which is why getContentType() returns null for it.
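
The same check can be done from Java. The following is a minimal sketch, not a definitive implementation: it sends a HEAD request (roughly what `curl -I` does) and treats a missing header as a normal outcome. The class name `ContentTypeCheck`, the helper names, and the `application/octet-stream` fallback are assumptions made for illustration; they are not part of the original question.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class ContentTypeCheck {

    // Hypothetical helper: returns the fallback when the header is absent or blank.
    static String contentTypeOrDefault(String headerValue, String fallback) {
        if (headerValue == null || headerValue.trim().isEmpty()) {
            return fallback;
        }
        return headerValue;
    }

    // Sends a HEAD request and returns the Content-Type header,
    // or null when the server does not send one.
    static String headContentType(String url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setRequestMethod("HEAD");
        try {
            return connection.getContentType();
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass one or more URLs on the command line.
        for (String url : args) {
            String type = contentTypeOrDefault(headContentType(url), "application/octet-stream");
            System.out.println(url + " -> " + type);
        }
    }
}
```

Whether a generic fallback is acceptable depends on what the caller does with the MIME type; if the real type matters, it has to be detected from the content itself.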

    An alternative is to download the file and detect the MIME type from its content. See: https://www.baeldung.com/java-file-mime-type
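
One way to detect the type from the content, sketched below under stated assumptions, is the JDK's own `URLConnection.guessContentTypeFromStream`, which inspects the first few bytes of a stream. It only recognizes a limited set of formats and needs a stream that supports mark/reset, so the sketch wraps the input in a `BufferedInputStream` when necessary. The class name `ContentSniffer` and its method names are hypothetical.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

class ContentSniffer {

    // Guesses the MIME type from the leading bytes of the stream.
    // Returns null when the JDK does not recognize the format.
    static String sniffContentType(InputStream in) throws IOException {
        // guessContentTypeFromStream requires mark/reset support
        InputStream markable = in.markSupported() ? in : new BufferedInputStream(in);
        return URLConnection.guessContentTypeFromStream(markable);
    }

    // Prefers the Content-Type header and falls back to sniffing the body.
    static String resolveContentType(String url) throws IOException {
        URLConnection connection = new URL(url).openConnection();
        String fromHeader = connection.getContentType();
        if (fromHeader != null) {
            return fromHeader;
        }
        try (InputStream body = connection.getInputStream()) {
            return sniffContentType(body);
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass one or more URLs on the command line.
        for (String url : args) {
            System.out.println(url + " -> " + resolveContentType(url));
        }
    }
}
```

The result can still be null for formats the JDK does not know, so callers should handle that case as well.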