I created a URL validator for my JSF web page and have now stumbled across a problem with domains where the first label (the first dot-separated part) contains a non-ASCII character.
I have the following valid website URL:
http://testä.com
Converting it to Punycode using IDN.toASCII()
produces an invalid URL:
xn--http://test-v8a.com
Shouldn't it be http://xn--test-ooa.com/ instead?
I also checked it with DENIC, the registry for the German .de domain, which shows the same invalid result.
Is this a bug in Java or in the RFC, or am I missing something?
When I remove the protocol first, it works.
The documentation is clear that this method only operates on domain names and their labels, so yes, the protocol needs to be removed before calling it.
A label is an individual part of a domain name. The original ToASCII operation, as defined in RFC 3490, only operates on a single label. This method can handle both label and entire domain name, by assuming that labels in a domain name are always separated by dots.
Link to Javadoc: https://docs.oracle.com/javase/8/docs/api/java/net/IDN.html#toASCII-java.lang.String-int-
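As a minimal sketch of that approach, the host can be cut out of the URL string, passed to IDN.toASCII() on its own, and then put back. The class name IdnUrlExample and the helper toAsciiUrl are hypothetical names chosen for illustration. The splitting is deliberately naive: it assumes a plain scheme://host[:port][/path] URL without user info or IPv6 literals.

```java
import java.net.IDN;

public class IdnUrlExample {

    // Hypothetical helper: converts only the host part of a URL string
    // to its ASCII (Punycode) form and leaves scheme, port and path untouched.
    static String toAsciiUrl(String url) {
        int schemeEnd = url.indexOf("://");
        if (schemeEnd < 0) {
            // No scheme found: assume the whole input is a domain name.
            return IDN.toASCII(url);
        }
        int hostStart = schemeEnd + 3;
        int hostEnd = url.length();
        // The host ends at the first port, path, query or fragment delimiter.
        for (int i = hostStart; i < url.length(); i++) {
            char c = url.charAt(i);
            if (c == '/' || c == ':' || c == '?' || c == '#') {
                hostEnd = i;
                break;
            }
        }
        String host = url.substring(hostStart, hostEnd);
        return url.substring(0, hostStart) + IDN.toASCII(host) + url.substring(hostEnd);
    }

    public static void main(String[] args) {
        // \u00e4 is the letter a with diaeresis from the question's example domain.
        System.out.println(toAsciiUrl("http://test\u00e4.com")); // prints http://xn--test-ooa.com
    }
}
```

Passing the full URL to IDN.toASCII() instead makes "http://testä" the first label, which is where the xn--http://test-v8a.com result comes from.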