JAR indexing and getResources


It appears to me that JAR file indexing breaks the mechanics of ClassLoader.getResources(). Consider the following program:

import java.io.*;
import java.net.*;
import java.util.*;

public class TryIt {
    public static void main(String[] args) throws Exception {
        URL[] urls = {
            (new File("a.jar")).getAbsoluteFile().toURI().toURL(),
            (new File("b.jar")).getAbsoluteFile().toURI().toURL()
        };
        URLClassLoader cl = URLClassLoader.newInstance(urls);
        String[] res = { "foo", "foo/", "foo/arb", "foo/bar", "foo/cab" };
        for (String r: res) {
            System.out.println("'" + r + "':");
            for (URL u: Collections.list(cl.getResources(r)))
                System.out.println(" " + u);
        }
    }
}

Now prepare the JAR files mentioned in that program:

mkdir -p a/foo b/foo
touch a/foo/arb a/foo/bar b/foo/bar b/foo/cab
echo "Class-Path: b.jar" > mf
jar cfm a.jar mf -C a foo
jar cf b.jar -C b foo

If you run java TryIt, you will get output like this:

'foo':
 jar:file:…/a.jar!/foo
 jar:file:…/b.jar!/foo
'foo/':
 jar:file:…/a.jar!/foo/
 jar:file:…/b.jar!/foo/
'foo/arb':
 jar:file:…/a.jar!/foo/arb
'foo/bar':
 jar:file:…/a.jar!/foo/bar
 jar:file:…/b.jar!/foo/bar
'foo/cab':
 jar:file:…/b.jar!/foo/cab

But if you run jar -i a.jar to create an index, then the same command as above prints this:

'foo':
 jar:file:…/a.jar!/foo
'foo/':
 jar:file:…/a.jar!/foo/
'foo/arb':
 jar:file:…/a.jar!/foo/arb
'foo/bar':
 jar:file:…/a.jar!/foo/bar
'foo/cab':
 jar:file:…/b.jar!/foo/cab

The index itself looks like this:

JarIndex-Version: 1.0

a.jar
foo

b.jar
foo

Doesn't the contract of getResources imply that all available resources matching the given name should be returned?

Finds all the resources with the given name.

Doesn't the JAR File Specification allow indexed packages to span multiple JAR files?

Normally one package name is mapped to one jar file, but if a particular package spans more than one jar file, then the mapped value of this package will be a list of jar files.

Is there some specification somewhere which says that what I'm observing is indeed correct (or at least permissible) behavior?

Is there some workaround to get all named resources despite the index?


Solution

  • This appears to be a bug.
    I've reported it to Oracle, and it's now in their bug database as bug 8150615.


    I did some digging around in the OpenJDK sources and found the reason for this behavior.

    The relevant class here is sun.misc.URLClassPath. It contains a (lazily constructed) list of loaders, and queries each loader in turn to assemble its result. However, if a JAR file contains an index, then the JAR files listed in that index will explicitly be excluded from the list of loaders. Instead, the loader for the JAR containing the index will query said index for the name in question, and traverse the resulting list of JAR files. But here is the catch: this happens in the method URLClassPath$JarLoader.getResource, which returns a single Resource object. It is not possible for this method to return multiple resources. And since all JAR files named in the index are handled by a single loader, a single resource is all we get.
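
As for a workaround: since the index only affects lookups that go through the class loader, one option is to bypass the class loader and ask each JAR file directly. The following is only a minimal sketch under some assumptions, not a fix for the bug itself. It assumes you already know the list of JAR files to search (as the test program in the question does) and that the resource name contains only characters that are legal in a URI. The class name AllJarResources and the method find are made up for this illustration.

```java
import java.io.File;
import java.io.IOException;
import java.net.URI;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

// Hypothetical helper: looks a resource up in each JAR directly,
// so a META-INF/INDEX.LIST in any of them plays no role.
public class AllJarResources {

    /** Returns one jar: URL for every given JAR that contains an entry called name. */
    public static List<URL> find(List<File> jars, String name) throws IOException {
        List<URL> found = new ArrayList<>();
        for (File jar : jars) {
            // JarFile reads the archive's own entry table; it does not consult the index.
            try (JarFile jf = new JarFile(jar)) {
                if (jf.getJarEntry(name) != null) {
                    // Same URL shape that getResources prints: jar:file:...!/name
                    // (assumes name needs no percent-encoding).
                    String spec = "jar:" + jar.getAbsoluteFile().toURI() + "!/" + name;
                    found.add(URI.create(spec).toURL());
                }
            }
        }
        return found;
    }

    // Usage: java AllJarResources <resource-name> <jar>...
    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java AllJarResources <resource-name> <jar>...");
            return;
        }
        List<File> jars = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            jars.add(new File(args[i]));
        }
        for (URL u : find(jars, args[0])) {
            System.out.println(" " + u);
        }
    }
}
```

With the JAR files from the question, java AllJarResources foo/bar a.jar b.jar should list an entry from each of the two JARs whether or not a.jar has been indexed. Note that this does not follow Class-Path manifest entries or parent class loaders, so it is not a drop-in replacement for getResources.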