Search code examples
character-encodingsaxonxpath-2.0

Why does fn:encode-for-uri('§') result in %C2%A7 rather than just %A7?


In Oxygen XML Editor 27.0, using the "XPath/XQuery Builder" (which, as far as I know, makes use of Saxon as XPath/XQuery processor), when I execute the XPath 2.0 query encode-for-uri('§'), I get %C2%A7 as a result. Where does the %C2 come from?

Encoding other "special characters" like $, (, | and so on, I get only the respective hexadecimal ASCII code (i.e. just one, not two - for instance: | => %7C).

Why is this different with §?


Solution

  • From fn:encode-for-uri:


    Like the fn:escape-html-uri and fn:iri-to-uri functions, this function replaces each special character with an escape sequence in the form %xx, where xx is two hexadecimal digits (in uppercase) that represent the character in UTF-8. For example, édition.html is changed to %C3%A9dition.html, with the é escaped as %C3%A9.

    Hence, § (U+00A7, Section Sign) is encoded as %C2%A7