
Another JQuery encoding problem, on IE


I'm coding an Italian website where I need to validate some input data with an XHR call. My code for the Ajax request looks like this (I'm using jQuery 1.3.2):

 $.ajaxSetup({
    type: "POST",
    timeout: 10000,
    contentType: "application/x-www-form-urlencoded; charset=iso-8859-1"        
}); 


 $.ajax({
    url: "ajaxvalidate.do",
    data: {field:controlInfo.field,value:controlInfo.fieldValue},
    dataType: "json",
    complete: function() {
        //
    },
    success: function(msg) {
        handleAsyncMsg(controlInfo, msg, closureOnError);
    },
    error: function(xhr, status, e) {            
        showException(controlInfo.id, status);

    }

});

On the backend I have a Java Struts action to handle the XHR. The page has to use the ISO-8859-1 encoding so that the data (especially accented characters) is sent correctly in the synchronous submit.

Everything works like a charm in Firefox, but when I handle an async POST from IE 7 with accented characters I have a problem: I always receive invalid characters (UTF-8, maybe?). For example, I type àààààààà in the form and get this value in my request: Ã Ã Ã Ã Ã Ã Ã Ã. Since the request charset is correctly set to ISO-8859-1, I can't understand why the server is still not parsing the form value correctly.

This is a log sample with all the request headers and the error (the server is an old BEA WebLogic 8.1):

Encoding: ISO-8859-1
Header: x-requested-with - Value: XMLHttpRequest
Header: Accept-Language - Value: it
Header: Referer - Value: https://10.172.14.36:7002/reg-docroot/conv/starttim.do
Header: Accept - Value: application/json, text/javascript
Header: Content-Type - Value: application/x-www-form-urlencoded; charset=iso-8859-1
Header: UA-CPU - Value: x86
Header: Accept-Encoding - Value: gzip, deflate
Header: User-Agent - Value: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
Header: Host - Value: 10.172.14.36:7002
Header: Content-Length - Value: 65
Header: Connection - Value: Keep-Alive
Header: Cache-Control - Value: no-cache
Header: Cookie - Value: JSESSIONID=JQJlNpVC86yTZJbcpt54wzt82TnkYmWYC5VLL2snt5Z8GTsQ1pLQ!1967684811
Attribute: javax.net.ssl.cipher_suite - Value: SSL_RSA_WITH_RC4_128_MD5
Attribute: javax.servlet.request.key-size - Value: 128
Attribute: javax.servlet.request.cipher_suite - Value: TLS_RSA_WITH_RC4_128_MD5
Attribute: javax.servlet.request.key_size - Value: 128
Attribute: weblogic.servlet.network_channel.port - Value: 7001
Attribute: weblogic.servlet.network_channel.sslport - Value: 7002
Attribute: org.apache.struts.action.MESSAGE - Value: org.apache.struts.util.PropertyMessageResources@4a97dbd
Attribute: org.apache.struts.globals.ORIGINAL_URI_KEY - Value: /conv/ajaxvalidate.do
Attribute: errors - Value: org.apache.struts.util.PropertyMessageResources@4a97e4d
Attribute: org.apache.struts.action.MODULE - Value: org.apache.struts.config.impl.ModuleConfigImpl@4aa2ff8
Attribute: weblogic.servlet.request.sslsession - Value: javax.net.ssl.impl.SSLSessionImpl@42157c5
field: nome - value: àààààààà - action: /endtim

Solution

  • contentType: "application/x-www-form-urlencoded; charset=iso-8859-1"

    You can say in the header that you're sending a form submission as ISO-8859-1, but that doesn't mean you actually are. jQuery uses the standard JavaScript encodeURIComponent() method to encode Unicode strings into query-string bytes, and that always uses UTF-8.
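    The mismatch can be reproduced in any JavaScript console. The snippet below is only an illustrative sketch: the variable names are made up, and unescape() is used here merely to simulate a server that decodes each byte as ISO-8859-1.

```javascript
var aGrave = '\u00E0'; // the character "à"

// What encodeURIComponent() produces: the UTF-8 bytes, percent-encoded.
var utf8Encoded = encodeURIComponent(aGrave);   // '%C3%A0' (two bytes)

// What an ISO-8859-1 form submission would carry: a single byte.
var latin1Encoded = escape(aGrave);             // '%E0'

// Simulate a server reading the UTF-8 bytes as ISO-8859-1.
// unescape() maps each %XX to the code point XX, which is how
// ISO-8859-1 decoding behaves.
var misread = unescape(utf8Encoded);            // 'Ã' followed by U+00A0 (no-break space)
```

    Each à therefore arrives as two Latin-1 characters, which matches the Ã sequences shown in the question.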

    In any case, a ‘charset’ parameter on the MIME type ‘application/x-www-form-urlencoded’ is highly non-standard. As an ‘x-’ type it has no official MIME registration, HTML 4.01 doesn't specify such a parameter, and it would be very unusual for an ‘application/*’ type. WebLogic claims to detect this construct, for what it's worth.

    So what you can do is either:

    1: create the POST body form-urlencoded content yourself, hacking it into ISO-8859-1 format manually, using something like

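    function encodeLatin1URIComponent(str) {
        var bytes= '';
        // Keep characters that fit in one ISO-8859-1 byte; replace the rest with '?'
        for (var i= 0; i<str.length; i++)
            bytes+= str.charCodeAt(i)<256? str.charAt(i) : '?';
        // escape() emits %XX for code points below 256 but leaves '+' alone,
        // and '+' means a space in form data, so encode it by hand
        return escape(bytes).split('+').join('%2B');
    }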
    

    instead of encodeURIComponent().

    2: lose the ‘charset’ and leave it submitting UTF-8 as normal, and make your servlet understand incoming UTF-8. This is generally best, but it means mucking around with the servlet container config to make it choose the right encoding. For WebLogic this seems to mean using an <input-charset> element in weblogic.xml. By then you're looking at moving your whole app to UTF-8, which is by no means a bad thing (non-Unicode-capable websites are sooo 20th-century!) but may well be a lot of work.
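    For option 1, the request body has to be assembled by hand and handed to jQuery as a string, since jQuery only runs its own serialization on non-string data. A minimal sketch, assuming a flat set of parameters; buildLatin1FormBody is a hypothetical helper name, and the encoder is repeated only so the snippet runs on its own:

```javascript
// Copy of the helper from option 1, repeated so this snippet is self-contained.
function encodeLatin1URIComponent(str) {
    var bytes = '';
    for (var i = 0; i < str.length; i++)
        bytes += str.charCodeAt(i) < 256 ? str.charAt(i) : '?';
    return escape(bytes).split('+').join('%2B');
}

// Hypothetical helper: turn a flat object into an ISO-8859-1 form body.
function buildLatin1FormBody(params) {
    var pairs = [];
    for (var name in params) {
        if (Object.prototype.hasOwnProperty.call(params, name)) {
            pairs.push(encodeLatin1URIComponent(name) + '=' +
                       encodeLatin1URIComponent(String(params[name])));
        }
    }
    return pairs.join('&');
}

// '\u00E0' is the character "à".
var body = buildLatin1FormBody({ field: 'nome', value: '\u00E0\u00E0' });
// body is 'field=nome&value=%E0%E0'

// In the page, pass the string instead of an object:
// $.ajax({ url: "ajaxvalidate.do", data: body, dataType: "json" /* ... */ });
```

    With a string body the charset in the Content-Type header finally describes the bytes that are actually sent. Characters outside ISO-8859-1 are silently turned into '?', so this only suits forms where that loss is acceptable.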