HtmlUnit - HTMLParser (page with characters)

I have a resource (a static HTML page) that I want to use in a test. But when I load the static page, its characters come out with encoding problems. I tried the StringEscapeUtils class, but it doesn't work. My function:

  private HtmlPage getStaticPage() throws IOException, ClassNotFoundException {
    final Reader reader = new InputStreamReader(this.getClass().getResourceAsStream("/" + "testPage" + ".html"), "UTF-8");
    final StringWebResponse response = new StringWebResponse(StringEscapeUtils.unescapeHtml4(IOUtils.toString(reader)), StandardCharsets.UTF_8, new URL(URL_PAGE));
    return HTMLParser.parseHtml(response, WebClientFactory.getInstance().getCurrentWindow());
  }

The StringEscapeUtils class comes from this import:

import org.apache.commons.lang3.StringEscapeUtils;

Solution

final Reader reader = new InputStreamReader(this.getClass().getResourceAsStream("/" + "testPage" + ".html"), "UTF-8");

For the reader, use the actual encoding of the file (from your comment, I guess this is windows-1252 in your case). Then read the file into a string (e.g. with commons-io).
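To illustrate why the reader's charset matters, here is a minimal JDK-only sketch. The class name `ResourceDecodingSketch`, the helper `readAll`, and the sample markup are made up for illustration, and the bytes are built in memory instead of being loaded with `getResourceAsStream`. It assumes Java 9+ (for `InputStream.readAllBytes`) and that the file really is windows-1252, which is only a guess.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ResourceDecodingSketch {

    // Hypothetical helper: reads the whole stream and decodes the bytes
    // with the given charset. IOException is rethrown unchecked.
    static String readAll(InputStream in, Charset charset) {
        try (InputStream stream = in) {
            return new String(stream.readAllBytes(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the file on disk is windows-1252 encoded.
        Charset fileCharset = Charset.forName("windows-1252");

        // Stand-in for the bytes of the HTML resource.
        byte[] fileBytes = "<p>caf\u00e9</p>".getBytes(fileCharset);

        // Decoding with the file's own charset round-trips the text.
        String right = readAll(new ByteArrayInputStream(fileBytes), fileCharset);

        // Decoding the same bytes as UTF-8 turns the accented letter
        // into the replacement character U+FFFD.
        String wrong = readAll(new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8);

        System.out.println(right.equals("<p>caf\u00e9</p>"));
        System.out.println(wrong.contains("\uFFFD"));
    }
}
```

With a real resource you would pass the stream from `getResourceAsStream` to the helper instead of the `ByteArrayInputStream`. Unescaping HTML entities is a separate concern and does not repair text that was decoded with the wrong charset.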

Then you can process it like this:

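// anHtmlCode is the HTML string read from the file in the step above;
// the URL is only the address the response pretends to come from
final StringWebResponse tmpResponse = new StringWebResponse(anHtmlCode,
    new URL("http://www.wetator.org/test.html"));
// aBrowserVersion is the BrowserVersion you want to simulate
final WebClient tmpWebClient = new WebClient(aBrowserVersion);
try {
  final HtmlPage tmpPage = HTMLParser.parseHtml(tmpResponse, tmpWebClient.getCurrentWindow());
  return tmpPage;
} finally {
  tmpWebClient.close();
}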

If you still have problems, please reduce your page to a simple sample that shows the problem and upload it here together with your code.