Preserving escaped characters when constructing URIs in Java


The documentation for java.net.URI specifies that

For any URI u that ... and that does not encode characters except those that must be quoted, the following identities also hold...

But what about URIs that do encode characters that don't need to be quoted?

URI test1 = new URI("http://foo.bar.baz/%E2%82%AC123");
URI test2 = new URI(test1.getScheme(), test1.getUserInfo(), test1.getHost(), test1.getPort(), test1.getPath(), test1.getQuery(), test1.getFragment());
assert test1.equals(test2); // blows up

This fails because test2 comes out as http://foo.bar.baz/€123, with the escaped characters unescaped.

My question, then, is: how can I construct a URI equal to test1 -- preserving the escaped characters -- out of its components? It's no good using getRawPath() instead of getPath(), because then the escaping characters themselves get escaped, and you end up with http://foo.bar.baz/%25E2%2582%25AC123.
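For illustration, both behaviours can be reproduced with a small sketch like the one below. The class name `EscapeDemo` and the helper `rebuildWithPath` are made up for this example; only the `java.net.URI` calls come from the JDK.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class EscapeDemo {

    // Hypothetical helper: rebuilds `u` with the seven-argument constructor,
    // using `path` as the path component and the decoded values for the rest.
    static URI rebuildWithPath(URI u, String path) {
        try {
            return new URI(u.getScheme(), u.getUserInfo(), u.getHost(), u.getPort(),
                           path, u.getQuery(), u.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI test1 = URI.create("http://foo.bar.baz/%E2%82%AC123");

        // Decoded path: the euro sign ends up in the URI literally.
        System.out.println(rebuildWithPath(test1, test1.getPath()));

        // Raw path: the constructor quotes every '%' again, giving %25E2%2582%25AC123.
        System.out.println(rebuildWithPath(test1, test1.getRawPath()));
    }
}
```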

Additional notes:

  1. Don't ask why I need to preserve escaped characters that in theory don't need to be escaped -- trust me, you don't want to know.
  2. In reality I don't want to preserve all of the original URL, just most of it -- possibly replacing the host, port, protocol, even parts of the path, so new URI(test1.toString()) is not the answer. Maybe the answer is to do everything with strings and replicate the URI class's ability to parse and construct URIs in my own code, but that seems daft.

Updated to add:

Note that the same issue exists with query parameters etc. -- it's not just the path.


Solution

  • I think this hack will work for you:

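    URI test1 = new URI("http://foo.bar.baz/example%E2%82%AC123");

    // The multi-argument constructor receives the decoded components,
    // so at this point test2 contains a literal "€".
    URI test2 = new URI(test1.getScheme(),
                        test1.getUserInfo(),
                        test1.getHost(),
                        test1.getPort(),
                        test1.getPath(),
                        test1.getQuery(),
                        test1.getFragment());

    // toASCIIString() escapes the non-ASCII characters again; re-parse the result.
    test2 = new URI(test2.toASCIIString());

    assert test1.equals(test2);

    System.out.println(test1);
    System.out.println(test2);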

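    The additional step is toASCIIString(), which escapes the non-ASCII characters again before the string is re-parsed. Note that this only restores escapes for non-ASCII characters: an escaped ASCII character such as %41 is decoded by getPath() and stays decoded, so the URIs would no longer be equal in that case.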
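If escaped ASCII characters also have to survive, or if the scheme, host or port needs to be swapped, one possible alternative is to assemble the string from the raw components and hand it to the single-argument parser, which does not quote `%` again. The following is only a sketch under assumptions: `RawUriRebuilder` and `rebuild` are hypothetical names, it handles hierarchical URIs only, and the replacement host is assumed to be already valid.

```java
import java.net.URI;

final class RawUriRebuilder {

    // Hypothetical helper: reassembles a hierarchical URI from the RAW
    // (still escaped) components of `source`, with a new scheme, host and port.
    // Throws IllegalArgumentException if the assembled string is not a valid URI.
    static URI rebuild(URI source, String scheme, String host, int port) {
        StringBuilder sb = new StringBuilder();
        if (scheme != null) {
            sb.append(scheme).append(':');
        }
        if (host != null) {
            sb.append("//");
            if (source.getRawUserInfo() != null) {
                sb.append(source.getRawUserInfo()).append('@');
            }
            sb.append(host);
            if (port != -1) {
                sb.append(':').append(port);
            }
        }
        if (source.getRawPath() != null) {
            sb.append(source.getRawPath());
        }
        if (source.getRawQuery() != null) {
            sb.append('?').append(source.getRawQuery());
        }
        if (source.getRawFragment() != null) {
            sb.append('#').append(source.getRawFragment());
        }
        // URI.create parses the string as-is; existing escapes are left untouched.
        return URI.create(sb.toString());
    }

    public static void main(String[] args) {
        URI test1 = URI.create("http://foo.bar.baz/%E2%82%AC123?q=%41%2Fb#frag%20ment");

        // Same scheme, host and port: the result equals the original.
        URI same = rebuild(test1, test1.getScheme(), test1.getHost(), test1.getPort());
        System.out.println(same.equals(test1)); // prints "true"

        // Different scheme, host and port: path, query and fragment keep their escapes.
        System.out.println(rebuild(test1, "https", "example.org", 8443));
    }
}
```

This leans on the URI class for parsing and validation and only does the concatenation by hand, so it stops well short of reimplementing URI parsing. Replacing parts of the path would mean editing the string returned by getRawPath() before appending it.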