Tags: java, unicode, url-encoding, java-5, encodeuricomponent

Java URLDecoder returns "?"


I have a very basic test which fails on me and I can't figure out why.

Here is my code:

System.out.println(URLEncoder.encode("去", "UTF-8")); // result = "%E5%8E%BB"
System.out.println(URLDecoder.decode("%E5%8E%BB", "UTF-8")); // result = ?

Why does the second System.out.println print "?"? I am expecting to see 去 again.

For the larger picture: I will be using encodeURIComponent() in JavaScript to post my data to a servlet, where I want to call URLDecoder.decode. But I can't even get the example above to work. What am I missing?

UPDATE: I just noticed something strange. When I run the code in a servlet I get the result I described, but when I run it in a plain main method it works. Here is my servlet code:

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.net.URLDecoder;
import java.net.URLEncoder;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class SaveFile extends BasicServiceServlet {
 public void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException,  IOException {
    //request.setCharacterEncoding("UTF-8"); 
    //response.setContentType("text/html; charset=UTF-8");  
    String DIR = getBaseUrl();

    String project = request.getParameter("project"); 
    String foldername = request.getParameter("foldername"); 
    String filename = request.getParameter("filename");
    String fileContent = (String)request.getParameter("content");
    String ch = (String)request.getParameter("char"); //char = 去
    String pathToFile = DIR + project + "/" + foldername + "/" + filename; 
    System.out.println(URLEncoder.encode("去", "UTF-8")); // results in %E5%8E%BB
    System.out.println(URLDecoder.decode(ch, "UTF-8")); // results in ?
    System.out.println(URLDecoder.decode("%E5%8E%BB", "UTF-8")); // results in ?
    System.out.println("去".equals(URLDecoder.decode("%E5%8E%BB", "UTF-8"))); // results in true

    try {
        //writing it to file results in ?
        BufferedWriter out = new BufferedWriter(new FileWriter(pathToFile));
        out.write(URLDecoder.decode(fileContent, "UTF-8"));
        out.close();
        System.out.println("STAT - SaveFile " + filename);
    }catch(IOException e){
        System.out.println("STAT - SaveFile Error");
    }
   }
 }

But running a simple main method works for me:

import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class test {

    public static void main(String[] args) {
        // TODO Auto-generated method stub
        try {
            System.out.println(URLEncoder.encode("去", "UTF-8"));
            System.out.println(URLDecoder.decode("%E5%8E%BB", "UTF-8"));
        } catch (UnsupportedEncodingException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}

Solution

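  • This is only a display issue in your console: System.out encodes its output with the default charset, and a character that charset cannot represent is printed as ?. The encoding and decoding themselves work fine, as you can see from the following code: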

    System.out.println("去".equals(URLDecoder.decode("%E5%8E%BB", "UTF-8"))); //displays "true"
    

    --edit--

    Your servlet code probably writes ? to the file because you don't specify a character encoding when constructing the writer, so FileWriter falls back to the platform default encoding. Use the following instead:

    new OutputStreamWriter(new FileOutputStream(pathToFile), "UTF-8");
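
To check both points without relying on what the console can display, here is a minimal sketch. The class name Utf8RoundTrip, its helper methods, and the temporary file are made up for illustration and are not part of the original servlet. It decodes the value, prints the code point in hex (plain ASCII, so any console shows it), writes the text through an OutputStreamWriter with an explicit "UTF-8" charset, and reads it back the same way. The character is written as the escape \u53BB so the example does not depend on the encoding of the source file.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.URLDecoder;

// Hypothetical demo class; names are illustrative only.
public class Utf8RoundTrip {

    // Decode a percent-encoded value, interpreting the decoded bytes as UTF-8.
    public static String decodeUtf8(String encoded) throws IOException {
        return URLDecoder.decode(encoded, "UTF-8");
    }

    // Write text with an explicit charset instead of FileWriter's platform default.
    public static void writeUtf8(File file, String text) throws IOException {
        Writer out = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file), "UTF-8"));
        try {
            out.write(text);
        } finally {
            out.close();
        }
    }

    // Read the first line back using the same explicit charset.
    public static String readFirstLineUtf8(File file) throws IOException {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), "UTF-8"));
        try {
            return in.readLine();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        String decoded = decodeUtf8("%E5%8E%BB");

        // The hex code point is ASCII, so it prints correctly on any console.
        System.out.println(Integer.toHexString(decoded.codePointAt(0))); // prints 53bb

        File tmp = File.createTempFile("utf8-demo", ".txt");
        try {
            writeUtf8(tmp, decoded);
            // The character survives the file round trip.
            System.out.println(decoded.equals(readFirstLineUtf8(tmp))); // prints true
            // 去 (U+53BB) occupies three bytes in UTF-8.
            System.out.println(tmp.length()); // prints 3
        } finally {
            tmp.delete();
        }
    }
}
```

If the servlet writes the file this way and the file still contains ?, the character was most likely already lost before decoding, for example when the request parameters were parsed.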