I am doing a project that involves searching words in the Arabic script on Wiktionary, and when I do a GET request on certain word pages, I get something like this for example:
title="\xd8\xb1\xd8\xa3\xd8\xb3\xd9\x85\xd8\xa7\xd9\x84\xd9\x8a\xd8\xa9">\xd8\xb1\xd8\xa3\xd8\xb3\xd9\x85\xd8\xa7\xd9\x84\xd9\x8a\xd8\xa9</a></li>\n<li><a href="/wiki/%D8%B1%D8%A3%D8%B3%D9%8A"
This corresponds to the following URL: https://en.wiktionary.org/wiki/%D8%B1%D8%A3%D8%B3%D9%8A.
Does anyone know what the \xd8 or %D8 encodings are called? I want to say they are hex codes, but I have already looked up hex codes for the Arabic script and they certainly are not these.
The percent signs you see in the URL stand in for characters that aren't allowed to appear literally in URLs: reserved characters such as "/", ":" and "&", and any non-ASCII characters. Each "%XX" is one byte written as two hexadecimal digits. This is called percent-encoding - https://en.m.wikipedia.org/wiki/Percent-encoding
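As a rough illustration of percent-encoding, here is a minimal Python sketch that decodes the path segment from the URL in the question and encodes it back. Python is an assumption on my part (the question doesn't say which language is used); `unquote` and `quote` come from the standard library's `urllib.parse` and default to UTF-8.

```python
from urllib.parse import quote, unquote

# Path segment copied from the URL in the question.
encoded = "%D8%B1%D8%A3%D8%B3%D9%8A"

# unquote() turns each %XX into a byte, then decodes the bytes as UTF-8.
word = unquote(encoded)
print(word)

# quote() does the reverse: UTF-8 encode, then percent-encode each byte.
print(quote(word) == encoded)
```

The decoded result is a four-character Arabic word, so each character accounts for two "%XX" groups in the URL.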
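The "\xd8"-style sequences are hexadecimal escapes for individual bytes, not for characters. Arabic characters fall outside ASCII, so UTF-8 encodes each of them as two bytes, and those raw bytes are what you see when the response is printed without being decoded first. For example, \xd8\xb1 is the UTF-8 encoding of ر (U+0631), which is why the values don't match the Unicode code points you looked up. The %D8%B1 in the URL is the same pair of bytes, just percent-encoded. That's assuming the HTML you showed is encoded as UTF-8.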
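To make the byte escapes concrete, here is a small sketch, again assuming Python and assuming the page is UTF-8. It takes the bytes from the `title` attribute in the question and decodes them into readable text.

```python
# Bytes copied from the title attribute in the question's output.
raw = b"\xd8\xb1\xd8\xa3\xd8\xb3\xd9\x85\xd8\xa7\xd9\x84\xd9\x8a\xd8\xa9"

# Decode the raw UTF-8 bytes into a str of Unicode characters.
text = raw.decode("utf-8")
print(text)

# Each Arabic letter here takes two bytes in UTF-8.
print(len(raw), len(text))

# Show the Unicode code point of each character (these are the values
# listed in Arabic code charts, unlike the byte values above).
print([f"U+{ord(ch):04X}" for ch in text])
```

If the GET request is made with the `requests` library (an assumption, since the question doesn't say), `response.content` gives the raw bytes like the ones above, while `response.text` gives the already decoded string.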