
java.net.URI get host with underscores


I am seeing strange behavior from URI.getHost():

    import java.net.URI;

    URI url = new URI("https://pmi_artifacts_prod.s3.amazonaws.com");
    System.out.println(url.getHost()); // prints null
    URI url2 = new URI("https://s3.amazonaws.com");
    System.out.println(url2.getHost()); // prints s3.amazonaws.com

I want the first url.getHost() call to return pmi_artifacts_prod.s3.amazonaws.com, but it returns null. It turns out the problem is the underscores in the domain name. This is a known bug, but what can I do, given that I need to work with exactly this host?


Solution

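  • The bug is not in Java but in the hostname itself: an underscore is not a valid character in a hostname. Such names are widely used anyway, but java.net.URI does not accept them as a host, so getHost() returns null. The URI as a whole is still parsed, which is why no exception is thrown and getAuthority() still returns the full string.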

    https://en.wikipedia.org/wiki/Hostname#Restrictions_on_valid_hostnames

    A possible workaround:

    import java.lang.reflect.Field;
    import java.net.URI;
    import java.net.URISyntaxException;

    public class UriHostWorkaround {
        public static void main(String... args) throws URISyntaxException, ReflectiveOperationException {
            URI url = new URI("https://pmi_artifacts_prod.s3.amazonaws.com");
            System.out.println(url.getHost()); // null

            URI uriObj = new URI("https://pmi_artifacts_prod.s3.amazonaws.com");
            if (uriObj.getHost() == null) {
                // Force the private "host" field via reflection. This relies on
                // JDK internals; on Java 16 and later it fails with
                // InaccessibleObjectException unless the JVM is started with
                // --add-opens java.base/java.net=ALL-UNNAMED.
                final Field hostField = URI.class.getDeclaredField("host");
                hostField.setAccessible(true);
                hostField.set(uriObj, "pmi_artifacts_prod.s3.amazonaws.com");
            }
            System.out.println(uriObj.getHost()); // pmi_artifacts_prod.s3.amazonaws.com

            URI url2 = new URI("https://s3.amazonaws.com");
            System.out.println(url2.getHost()); // s3.amazonaws.com
        }
    }
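
If reflection is not an option, the host can usually be recovered from getAuthority() instead, since java.net.URI keeps the raw authority even when it rejects the hostname. The following is only a sketch: the class LenientHost and the method hostOf are hypothetical names, not part of the JDK, and the code assumes the authority has the usual [userinfo@]host[:port] shape.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper (not part of the JDK or of the answer above):
// falls back to the raw authority when URI.getHost() returns null.
class LenientHost {

    static String hostOf(URI uri) {
        String host = uri.getHost();
        if (host != null) {
            return host; // hostname was accepted, nothing to recover
        }
        String authority = uri.getAuthority();
        if (authority == null) {
            return null; // no authority at all, e.g. "mailto:..." or a relative URI
        }
        // Assumption: authority looks like [userinfo@]host[:port].
        int at = authority.lastIndexOf('@');
        if (at >= 0) {
            authority = authority.substring(at + 1); // drop "user:password@"
        }
        int colon = authority.lastIndexOf(':');
        if (colon >= 0) {
            authority = authority.substring(0, colon); // drop ":port"
        }
        return authority;
    }

    public static void main(String[] args) throws URISyntaxException {
        URI bad = new URI("https://pmi_artifacts_prod.s3.amazonaws.com");
        System.out.println(bad.getHost());  // null
        System.out.println(hostOf(bad));    // pmi_artifacts_prod.s3.amazonaws.com

        URI good = new URI("https://s3.amazonaws.com");
        System.out.println(hostOf(good));   // s3.amazonaws.com
    }
}
```

Unlike the reflection approach, this does not change what uri.getHost() itself returns, so it only helps in code that you control and can route through the helper.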