
URL specification that allows internationalized domain names?


I'm trying to find the current spec for URLs. I found RFC 3986, but it doesn't allow non-ASCII characters, such as those found in internationalized domain names (IDNs).

URLs are one of the key bits of the Internet; surely there must be a spec for them? If so, where is it?


Solution

  • Internationalized Resource Identifiers (IRIs)

    https://www.rfc-editor.org/rfc/rfc3987

    This document defines a new protocol element, the Internationalized Resource Identifier (IRI), as a complement to the Uniform Resource Identifier (URI). An IRI is a sequence of characters from the Universal Character Set (Unicode/ISO 10646). A mapping from IRIs to URIs is defined, which means that IRIs can be used instead of URIs, where appropriate, to identify resources.


    Note that Anne van Kesteren edits a newer URL specification at WHATWG, published as a "Living Standard", whose stated goal is to "Align RFC 3986 and RFC 3987 with contemporary implementations and obsolete them in the process."
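
To make the IRI-to-URI mapping mentioned in the RFC 3987 abstract more concrete, here is a rough Python sketch. It is not a complete implementation of the RFC: the function name `iri_to_uri` and the example address are made up for illustration, userinfo is ignored, and the host is converted with Python's built-in `idna` codec, which implements the older IDNA 2003 rules rather than whatever a given browser does. Everything outside the host is percent-encoded as UTF-8.

```python
from urllib.parse import urlsplit, urlunsplit, quote

# Characters left untouched when percent-encoding. "%" is included so that
# escapes already present in the input are not encoded a second time.
_PATH_SAFE = "/%:@!$&'()*+,;=~-._"
_QUERY_SAFE = "/?%:@!$&'()*+,;=~-._"


def iri_to_uri(iri: str) -> str:
    """Map an IRI to an ASCII-only URI (simplified sketch, no userinfo)."""
    parts = urlsplit(iri)

    # Host: convert each label to its ASCII "xn--" form (IDNA 2003 here).
    host = parts.hostname or ""
    netloc = host.encode("idna").decode("ascii") if host else ""
    if parts.port is not None:
        netloc += f":{parts.port}"

    # Path, query, fragment: percent-encode the UTF-8 bytes of any
    # non-ASCII character.
    path = quote(parts.path, safe=_PATH_SAFE)
    query = quote(parts.query, safe=_QUERY_SAFE)
    fragment = quote(parts.fragment, safe=_QUERY_SAFE)

    return urlunsplit((parts.scheme, netloc, path, query, fragment))


print(iri_to_uri("http://bücher.example/straße?q=ü#é"))
# http://xn--bcher-kva.example/stra%C3%9Fe?q=%C3%BC#%C3%A9
```

An IRI that is already plain ASCII passes through unchanged, so the function can be applied to any address before handing it to code that only accepts RFC 3986 URIs.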