The documentation for java.net.URI specifies that:
For any URI u that ... and that does not encode characters except those that must be quoted, the following identities also hold...
But what about URIs that do encode characters that don't need to be quoted?
URI test1 = new URI("http://foo.bar.baz/%E2%82%AC123");
URI test2 = new URI(test1.getScheme(), test1.getUserInfo(), test1.getHost(), test1.getPort(), test1.getPath(), test1.getQuery(), test1.getFragment());
assert test1.equals(test2); // blows up
This fails because test2 comes out as http://foo.bar.baz/€123, with the escaped characters un-escaped.
My question, then, is: how can I construct a URI equal to test1, preserving the escaped characters, out of its components? It's no good using getRawPath() instead of getPath(), because then the escape characters themselves get escaped, and you end up with http://foo.bar.baz/%25E2%2582%25AC123.
Additional notes:
new URI(test1.toString()) is not the answer. Maybe the answer is to do everything with strings and replicate the URI class's ability to parse and construct URIs in my own code, but that seems daft.
Updated to add: the same issue exists with query parameters etc.; it's not just the path.
I think this hack will work for you:
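URI test1 = new URI("http://foo.bar.baz/example%E2%82%AC123");
// The multi-argument constructor takes decoded components, so test2 contains a literal €.
URI test2 = new URI(test1.getScheme(),
                    test1.getUserInfo(),
                    test1.getHost(),
                    test1.getPort(),
                    test1.getPath(),
                    test1.getQuery(),
                    test1.getFragment());
// Re-encode the non-ASCII characters, then re-parse with the single-argument constructor.
test2 = new URI(test2.toASCIIString());
assert test1.equals(test2); // only checked when assertions are enabled (-ea)
System.out.println(test1);
System.out.println(test2);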
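The additional step is toASCIIString(), which re-encodes the non-ASCII characters as escaped octets. Parsing that string with the single-argument constructor then keeps those escapes instead of quoting them again.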
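Note that the toASCIIString() round trip only restores escapes for non-ASCII characters: an escaped ASCII character such as %41 is decoded by getPath() and never re-encoded. If that matters, one option is to skip the decoded getters entirely. The following is a minimal sketch, not a definitive implementation: rebuild is a hypothetical helper name, and it assumes a hierarchical URI (opaque URIs such as mailto: are not handled). It joins the raw components into a string and parses it with URI.create, which should leave existing escapes as they are, for the query and fragment as well as the path.

```java
import java.net.URI;

public class RawUriRebuild {

    // Hypothetical helper: reassembles a hierarchical URI from its raw
    // (still-escaped) components. Assumes the URI is not opaque.
    static URI rebuild(URI u) {
        StringBuilder sb = new StringBuilder();
        if (u.getScheme() != null) {
            sb.append(u.getScheme()).append(':');
        }
        if (u.getRawAuthority() != null) {
            // The raw authority already includes user info and port, if present.
            sb.append("//").append(u.getRawAuthority());
        }
        if (u.getRawPath() != null) {
            sb.append(u.getRawPath());
        }
        if (u.getRawQuery() != null) {
            sb.append('?').append(u.getRawQuery());
        }
        if (u.getRawFragment() != null) {
            sb.append('#').append(u.getRawFragment());
        }
        // Parsing a single string does not re-quote the percent signs,
        // unlike the multi-argument constructors.
        return URI.create(sb.toString());
    }

    public static void main(String[] args) {
        URI test1 = URI.create("http://foo.bar.baz/%E2%82%AC123?q=%41");
        URI test2 = rebuild(test1);
        System.out.println(test2);
        System.out.println(test1.equals(test2));
    }
}
```

Any single raw component could be swapped for a different, already-escaped value before the string is assembled, which avoids writing a full URI parser by hand.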