Encoding Path to URI behaves differently when built into JAR or not


I have a call to a HashMap's containsKey method that I expect to return true. It returns true if I compile my program to .class files and run it like that, but if I build a JAR then the same call returns false for reasons I cannot comprehend.

I've debugged both the .class and the JAR versions (the JAR using a remote connection as described in http://www.eclipsezone.com/eclipse/forums/t53459.html ) and in both cases the HashMap appears to contain the Key I'm trying to check.

The HashMap uses URI objects as keys. Here are the contents of the variables as shown in each debug session:

When run as a .class file

HashMap Key: java.net.URI = file:/E:/SSD%20App%20Libraries/Google%20Drive/Programming/Bet%20Matching/Java%20Sim/target/classes/simfiles/paytables/

URIToCheck: java.net.URI = file:///E:/SSD%20App%20Libraries/Google%20Drive/Programming/Bet%20Matching/Java%20Sim/target/classes/simfiles/paytables/

Result: GameTreeItemsMap.containsKey(URIToCheck) is true

When run as a JAR

HashMap Key: java.net.URI = jar:file:/E:/SSD%20App%20Libraries/Google%20Drive/Programming/Bet%20Matching/Java%20Sim/out/artifacts/JavaSim_jar/Java%20Sim.jar!/simfiles/paytables/

URIToCheck: java.net.URI = jar:file:///E:/SSD%2520App%2520Libraries/Google%2520Drive/Programming/Bet%2520Matching/Java%2520Sim/out/artifacts/JavaSim_jar/Java%2520Sim.jar!/simfiles/paytables/

Result: GameTreeItemsMap.containsKey(URIToCheck) is false

I would expect the method to return true in both cases. Do URIs somehow behave differently inside a JAR? What's going on?

Thanks in advance for any help!

Edit 1: As was pointed out to me, the URIToCheck in the JAR case is being double encoded (%2520 instead of %20). Here's the code that generates the URIToCheck, using the walkFileTree method.

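        // paytableHomePath and GameTreeItemsMap are defined elsewhere in the program.
        Files.walkFileTree(paytableHomePath, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {

                // Look up the parent directory's URI in the map.
                // (getParent() returns null for a root directory.)
                Path parentPath = dir.getParent();
                URI parentURIToCheck = parentPath.toUri();

                // true when run from .class files, false when run from the JAR
                boolean testContain = GameTreeItemsMap.containsKey(parentURIToCheck);

                return FileVisitResult.CONTINUE;
            }
        });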

In the JAR case, the URI parentURIToCheck is double encoded (with %2520 for a space where there should be %20), and in the .class case this does not happen. Any idea why?
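
The lookup behaviour can be reproduced without a JAR or a file walk. The following is a minimal sketch, not the code from my program: the class name UriKeyDemo, the helper found, and the shortened paths are made up for illustration. It checks two things: whether java.net.URI treats file:/ and file:/// as the same key (as the .class run suggests), and whether a %2520 URI matches a %20 key.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class UriKeyDemo {
    // Hypothetical helper: stores 'key' in a HashMap, then looks up 'probe'.
    public static boolean found(String key, String probe) {
        Map<URI, String> map = new HashMap<>();
        map.put(URI.create(key), "value");
        return map.containsKey(URI.create(probe));
    }

    public static void main(String[] args) {
        // Same path, one vs. three slashes after "file:" (the .class case).
        System.out.println(found("file:/E:/My%20Dir/",
                                 "file:///E:/My%20Dir/"));

        // Same jar: URI, but the probe is percent-encoded twice (the JAR case).
        System.out.println(found("jar:file:/E:/My%20Dir/a.jar!/x/",
                                 "jar:file:/E:/My%2520Dir/a.jar!/x/"));
    }
}
```

The first lookup should succeed and the second should fail, which would mean the HashMap is behaving correctly and the difference comes from how the URI is produced.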


Solution

  • This has nothing to do with HashMap. It appears you've found a bug in Java's Zip File System Provider, namely that it converts a Path to a doubly-encoded URI.

    I can't find an existing bug for it, so I've submitted one. (There is this related bug, whose fix I suspect is the cause of this one.) Update: This is Java bug 8131067.

    Here's the program I wrote to demonstrate the problem, in Java 1.8.0_45-b14. Pass a .jar file with one or more spaces in its path as the first command line argument.

    import java.util.Map;
    import java.util.Collections;
    import java.net.URI;
    import java.io.IOException;
    import java.nio.file.FileSystem;
    import java.nio.file.FileSystems;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    
    public class JarPathTest {
        public static void main(String[] args)
        throws IOException {
    
            Path zip = Paths.get(args[0]);
    
            URI zipURI = URI.create("jar:" + zip.toUri());
            System.out.println(zipURI);
    
            Map<String, String> env = Collections.emptyMap();
    
            try (FileSystem fs = FileSystems.newFileSystem(zipURI, env)) {
                Path root = fs.getPath("/");
                System.out.println(root.toUri());
            }
        }
    }
    

    You can work around it by taking the decoded scheme-specific part (in which each %2520 has become %20) and parsing it again as if it were still percent encoded:

    parentURIToCheck = URI.create(
        parentURIToCheck.getScheme() + ":" +
        parentURIToCheck.getSchemeSpecificPart());
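
To see the workaround in isolation, here is a small sketch that applies the same expression to a hard-coded, double-encoded URI. The class name DecodeOnceDemo, the method name decodeOnce, and the shortened path are assumptions made for illustration; only the URI.create(getScheme() + ":" + getSchemeSpecificPart()) expression comes from the snippet above.

```java
import java.net.URI;

public class DecodeOnceDemo {
    // getSchemeSpecificPart() returns the part after "jar:" with one level
    // of percent-decoding applied, so "%2520" becomes "%20". Parsing that
    // string again yields a singly-encoded URI.
    public static URI decodeOnce(URI uri) {
        return URI.create(uri.getScheme() + ":" + uri.getSchemeSpecificPart());
    }

    public static void main(String[] args) {
        // Made-up stand-in for what the zip file system provider returns.
        URI doubled = URI.create("jar:file:///E:/My%2520Dir/a.jar!/x/");

        System.out.println(doubled);              // still contains %2520
        System.out.println(decodeOnce(doubled));  // contains %20 instead
    }
}
```

Two caveats. First, apply this only to URIs known to be double encoded: on a correctly encoded URI, the decoded string contains a literal space and URI.create throws IllegalArgumentException. Second, java.net.URI treats jar: URIs as opaque and compares their scheme-specific parts as text, so jar:file:/E:/... and jar:file:///E:/... are not equal even after the encoding is fixed. If your map keys use the single-slash form, you may need to build keys and lookups the same way.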