Tags: javascript, jquery, ajax, character-encoding, encodeuricomponent

encodeURIComponent appears to add a character to my string


jQuery.ajax() is doing something weird when escaping my data.

For example, if I send the request:

$.ajax({
    url: 'somethinguninteresting',
    data: {
        name: 'Ihave¬aweirdcharacter'
    }
});

and then inspect the XHR in Chrome DevTools, the "Request Payload" is shown as name=Ihave%C2%ACaweirdcharacter

Now, I've figured out that:

'¬'.charCodeAt(0) === 172

and that 172 is AC in hexadecimal.

Working backwards, C2 (the "extra" character being prepended) in hexadecimal is 194 in decimal, and

String.fromCharCode(194) === 'Â'

My Question:

Why does

encodeURIComponent('¬')

return '%C2%AC', which would appear to be the result of calling

encodeURIComponent('Â¬')

(which itself returns '%C3%82%C2%AC')?


Solution

  • Although JavaScript uses UTF-16 (or UCS-2) internally, it performs URI encoding based on UTF-8.

    The code point 172 (U+00AC) lies outside the ASCII range (0 to 127), so UTF-8 encodes it in two bytes. The two-byte layout in UTF-8 is:

    110xxxxx 10xxxxxx
    

    The eleven x positions are filled with the binary representation of 172, which is 10101100, padded on the left with zeros to eleven bits (00010 101100):

    11000010 10101100 = C2AC
       ^^^
       pad
    

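    Each of these two bytes is then percent-encoded, giving %C2%AC, which is what you saw in the request payload. The C2 is therefore not an extra character: it is the lead byte of the two-byte sequence, and it only looks like 'Â' when the bytes are read one per character (for example as Latin-1).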
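
The bit layout described above can be reproduced by hand. The following is a minimal sketch, not how any engine actually implements encodeURIComponent; the helpers utf8TwoByte and percentEncode are hypothetical names introduced only for illustration, and the sketch handles only code points that need exactly two bytes (U+0080 to U+07FF).

```javascript
// Hypothetical helper: encode one code point in U+0080..U+07FF as two
// UTF-8 bytes, following the 110xxxxx 10xxxxxx layout.
function utf8TwoByte(codePoint) {
    if (codePoint < 0x80 || codePoint > 0x7FF) {
        throw new RangeError('only U+0080..U+07FF are encoded in two bytes');
    }
    var lead = 0xC0 | (codePoint >> 6);     // 110xxxxx: upper 5 of the 11 bits
    var trail = 0x80 | (codePoint & 0x3F);  // 10xxxxxx: lower 6 bits
    return [lead, trail];
}

// Hypothetical helper: percent-encode an array of byte values.
function percentEncode(bytes) {
    return bytes.map(function (b) {
        return '%' + (b < 16 ? '0' : '') + b.toString(16).toUpperCase();
    }).join('');
}

var notSign = '\u00AC';                           // the '¬' character
var bytes = utf8TwoByte(notSign.charCodeAt(0));   // 172 -> [0xC2, 0xAC]
var manual = percentEncode(bytes);
var builtIn = encodeURIComponent(notSign);

console.log(manual);   // %C2%AC
console.log(builtIn);  // %C2%AC

// Reading the same two bytes as one character each gives 'Â¬',
// which is where the apparent "extra" character comes from.
var misread = String.fromCharCode.apply(null, bytes);
console.log(misread);                      // Â¬
console.log(encodeURIComponent(misread));  // %C3%82%C2%AC
```

The last two lines show the confusion in the question: encoding the misread string 'Â¬' encodes each of its two characters as two UTF-8 bytes again, which yields '%C3%82%C2%AC'.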