Tags: java, character-encoding, special-characters, outputstream, com.sun.net.httpserver

Special character encoding in my simple Java HTTPServer


I have a simple Java application, essentially a server built on the com.sun.net.httpserver.HttpServer API, that reads a file and sends back the text after some processing. The server part looks like this:

        server = HttpServer.create(new InetSocketAddress(serverPort), 0);
        logger.info("EventRetriever REST server listening to port: " + serverPort);
        server.createContext("/getEvents", new MedatadaHandler());
        server.setExecutor(null);
        server.start();
// ...
@Override
        public void handle(HttpExchange he) throws IOException {
        //...
        String response = requestEvents();
        he.sendResponseHeaders(200, response.length());
        OutputStream os = he.getResponseBody();
        os.write(response.toString().getBytes());
        os.close();
}
//...
public String requestEvents(){
//...
// this printing on the console looks fine though:
        logger.info(jsonString);
        return jsonString;
}

I run my jar file with java -jar myApp.jar from the command line, or directly from my IDE. I'm seeing some weird behavior, sometimes the request just hangs, whenever the response has to contain special characters, such as a music symbol. When I call IP:PORT/getEvents from a browser, the behavior depends on where the app is running:

If I run it in Windows PowerShell or Command Prompt, the symbol appears as ? on the console, and the browser also shows it as ?. But when I run the program on a Linux server or in my Eclipse IDE, the symbol is shown correctly on the console, yet the browser shows the following error even though the status is 200 OK. On the console I can see the Java application printing the line again every few seconds (as if it keeps trying to send the data but something is blocking it). I don't get any exceptions or errors in the app, and I log every error I can.

[Screenshot of the browser error; image not included]

I'm very confused by this behavior. What's going on?!

First, why does what I get depend on the environment I run my Java app in?! If Windows Command Prompt/PowerShell shows the character as ?, I would expect that to affect only the local console output. Why do I also see ? in my browser?! A Java app should be independent of the environment.

And second, what is going on with that error in the Linux/Eclipse environment when I request a line that contains this character?


Solution

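  • The issue, as could be predicted, was related to getBytes() and the UTF-8 representation of the String. response.length() counts chars rather than encoded bytes, and getBytes() with no argument uses the platform's default charset, so the length declared in sendResponseHeaders can disagree with the number of bytes actually written. Encoding explicitly as UTF-8 and sending the byte count fixed it: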

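            // Encode once so the declared length matches the bytes written.
            byte[] body = response.getBytes(StandardCharsets.UTF_8); // needs java.nio.charset.StandardCharsets
            he.sendResponseHeaders(200, body.length);
            OutputStream os = he.getResponseBody();
            os.write(body);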
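
To see why the char count and the byte count can differ, here is a minimal standalone sketch. The exact symbol from the question is not known, so U+266B (a music note) is assumed as a stand-in, and ISO-8859-1 is assumed as a stand-in for a single-byte Windows console charset; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class EncodingLengthDemo {
    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Encodes with a single-byte charset and decodes again.
    // Characters the charset cannot represent are replaced with '?'.
    static String roundTripLatin1(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Assumed example text: six ASCII characters plus U+266B.
        String response = "note: \u266B";

        // String.length() counts UTF-16 chars, not encoded bytes.
        System.out.println("chars:       " + response.length());
        // The music note takes three bytes in UTF-8, so this is larger.
        System.out.println("UTF-8 bytes: " + utf8Length(response));
        // With a single-byte charset the symbol is lost before it is sent.
        System.out.println("Latin-1:     " + roundTripLatin1(response));
    }
}
```

This would be consistent with both symptoms in the question: when the default charset cannot encode the symbol, it is already a ? in the bytes sent to the browser, and when the default charset is UTF-8, more bytes are written than response.length() announced.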