I am trying to search my database for the word "bäck", which contains a special Swedish character. I have a JSP page:
<%@ page pageEncoding="utf-8" contentType="text/html; charset=utf-8" %>
...
<form name="mainform" action="/web/admin/users/">
    <input id="keywords" type="text" name="keywords" size="30"
           value="${status.value}" tabindex="1" />
    <button class="link" type="submit">Search</button>
</form>
a filter:
public class RequestResponseCharacterEncodingFilter extends OncePerRequestFilter {
    private String encoding;
    private boolean forceEncoding;

    protected void doFilterInternal(
            HttpServletRequest request, HttpServletResponse response, FilterChain filterChain)
            throws ServletException, IOException {
        request.setCharacterEncoding(this.encoding);
        response.setCharacterEncoding(this.encoding);
        filterChain.doFilter(request, response);
    }
}
web.xml
<web-app ...>
    ...
    <filter>
        <filter-name>encodingFilter</filter-name>
        <filter-class>test.testdomain.spring.RequestResponseCharacterEncodingFilter</filter-class>
        <init-param>
            <param-name>encoding</param-name>
            <param-value>UTF-8</param-value>
        </init-param>
        <init-param>
            <param-name>forceEncoding</param-name>
            <param-value>true</param-value>
        </init-param>
    </filter>
    <filter-mapping>
        <filter-name>encodingFilter</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>
    ...
</web-app>
When I search for the word "bäck", it appears like this: bäck. The request is sent encoded as UTF-8, but right before I exit the doFilterInternal method of my filter, the debugger still shows the garbled value.
What am I doing wrong? Why is the text not decoded as UTF-8?
EDIT: Strangely, I have just tried the same query in Chrome and Mozilla Firefox, and it works there, so the problem seems to occur only in Internet Explorer.
EDIT: Internet Explorer gives me this string: b%C3%A4ck, but Mozilla Firefox and Chrome give me the string: b%E4ck. They are obviously different. Why is that?
Your screenshots indicate that your search keyword, bäck, is sent as part of the URL, as a URL parameter. They also indicate that this word is correctly URL-encoded as UTF-8. The String you get back in your debugger is typical of UTF-8 encoded bytes being decoded as ISO-8859-1 (ISO Latin-1): the HttpServletRequest parameter parser used ISO-8859-1 to decode a UTF-8 encoded string.
So your servlet filter is of no help in interpreting it:
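To see why the debugger shows bäck, the mis-decoding can be reproduced outside of any servlet container. The following is only a minimal sketch: the class and method names are hypothetical, and it simply encodes the keyword as UTF-8 and then decodes those bytes as ISO-8859-1, which is assumed to mirror what the request parser does here.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Encodes the text as UTF-8, then (wrongly) decodes those bytes as ISO-8859-1.
    // Each UTF-8 byte becomes one Latin-1 character, so a two-byte character
    // such as U+00E4 turns into two unrelated characters.
    public static String decodeUtf8BytesAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String keyword = "b\u00e4ck"; // the search keyword, written with an escape for U+00E4
        String garbled = decodeUtf8BytesAsLatin1(keyword);
        // The keyword has 4 characters; the garbled result has 5,
        // because U+00E4 was split into U+00C3 and U+00A4.
        System.out.println(keyword.length() + " -> " + garbled.length());
        System.out.println(garbled);
    }
}
```

The returned string is b, U+00C3, U+00A4, c, k, which is exactly the five-character value seen in the debugger.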
request.setCharacterEncoding(this.encoding);
response.setCharacterEncoding(this.encoding);
Because, as the Javadoc says, these methods work on the body of the HTTP request, not on its URL:
/**
* Overrides the name of the character encoding used in the body of this
* request. This method must be called prior to reading request parameters
* or reading input using getReader(). Otherwise, it has no effect.
*
Since URL parameter parsing is the responsibility of your servlet container, the setting you should look at is probably a container-level one. For example, on Tomcat, as stated in the documentation at http://tomcat.apache.org/tomcat-7.0-doc/config/http.html:
URIEncoding : This specifies the character encoding used to decode the URI bytes, after %xx decoding the URL. If not specified, ISO-8859-1 will be used.
By default, it uses ISO-8859-1. You should change that to UTF-8 (URIEncoding is an attribute of the Connector element in Tomcat's server.xml). Your request parameters will then be correctly parsed by your servlet container and passed to the HttpServletRequest object.
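The effect of that setting can be illustrated with the two query strings reported in the question. This is a sketch under stated assumptions: java.net.URLDecoder stands in for the container's URI decoding (it is not Tomcat's actual code), and the class name is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class UriDecodingDemo {

    // Percent-decodes the raw value, then interprets the resulting bytes
    // in the given charset. The charset plays the role of URIEncoding.
    public static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String expected = "b\u00e4ck"; // the intended keyword

        // UTF-8 encoded request, decoded as UTF-8: the keyword is recovered.
        System.out.println(decode("b%C3%A4ck", "UTF-8").equals(expected));

        // UTF-8 encoded request, decoded with the ISO-8859-1 default: garbled.
        System.out.println(decode("b%C3%A4ck", "ISO-8859-1").equals(expected));

        // Latin-1 encoded request, decoded with the ISO-8859-1 default:
        // the keyword is recovered, which is why this case appears to work.
        System.out.println(decode("b%E4ck", "ISO-8859-1").equals(expected));
    }
}
```

With the ISO-8859-1 default, only the b%E4ck form decodes to the intended keyword. Switching the container to UTF-8 makes the b%C3%A4ck form decode correctly instead, so the browsers should also be made to send the same encoding consistently.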
EDIT : As you are seeing inconsistent browser behaviour, you may look into the consistency of your HTML form. Please make sure that