
Java empty path convention, especially that used in ClassLoader.getResources


Today, I was surprised (likely due to inexperience) to find out that you can, and it is actually useful to, pass an empty path (literally an empty string) to ClassLoader.getResources, i.e. ClassLoader.getSystemClassLoader().getResources(""). This, from some testing, returns one or two directories of where my application .class files live (and does not include the directories of 3rd-party packages). (Example usage: Get all of the Classes in the Classpath.)

Presumably, this is because the Java System ClassLoader is the one, out of the three ClassLoaders, that loads my own application classes (cf. http://www.oracle.com/technetwork/articles/javase/classloaders-140370.html), so it's not surprising that the URL returned points to the directory of my application class files.

But why, and how, does the empty string achieve this? I did not find it documented. Is this empty path derived from a more common Java convention? It's certainly not a Linux one: you can't cd into an empty path in bash. I'd appreciate it if someone could help me understand this.

On another note, I noticed that getResources(".") achieves the same thing.

Addition following the discussion in the comments

import java.net.URL;
import java.net.URLClassLoader;
import java.util.Enumeration;

public class myTest {

    public static void main(String[] args) throws Exception {
        ClassLoader classLoader = ClassLoader.getSystemClassLoader();
        // The cast assumes a URLClassLoader-based system loader (Java 8 and earlier);
        // on Java 9+ it throws a ClassCastException.
        URL[] urls = ((URLClassLoader) classLoader).getURLs();
        for (URL url : urls) {
            System.out.println(url);  // lists external.jar
        }

        Enumeration<URL> roots = classLoader.getResources(".");
        while (roots.hasMoreElements()) {
            URL url = roots.nextElement();
            System.out.println("getResources: " + url);  // does not list external.jar
        }
    }
}

Command to execute: java -cp ".:external.jar" myTest


Solution

  • Why does a getResources(String) invocation match all classpath entries that are directories, when given a resource name of "" or "."?

    I can only speculate. For what it's worth, I consider this to be an implementation detail of the particular ClassLoader. The way the "" and "." resource names are treated is nevertheless somewhat intuitive from a filesystem user's perspective.

    ...and how?

    The default OpenJDK application ClassLoader (also known as the System ClassLoader), sun.misc.Launcher$AppClassLoader, is a URLClassLoader descendant with a URL search path comprising the values of the "java.class.path" system property. Its getResources (getResource as well) method ultimately delegates to sun.misc.URLClassPath$FileLoader.getResource(String, boolean), which does the following:

    url = new URL(getBaseURL(), ParseUtil.encodePath(name, false));
    ...
    file = new File(dir, name.replace('/', File.separatorChar)); // dir is the equivalent of getBaseURL()'s path component
    ...
    if (file.exists()) {
        return new sun.misc.Resource() {
            ...
            public URL getURL() { return url; } // eventually returned by the ClassLoader
        }
    }
    

    Leaving all the URL parsing aside, the resource name is essentially treated as a relative filesystem path and resolved against each entry of the loader's search path. A name argument of "" or "." therefore resolves to the search path entries themselves. In other words, all top-level directory classpath entries are matched and returned, as if they all resided beneath the same root directory. Note that this does not apply to JAR classpath entries, which are instead handled by sun.misc.URLClassPath$JarLoader.
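The resolution described above can be observed without touching the real classpath. The following is a minimal sketch, not the JDK's own code: it assumes a URLClassLoader whose only search path entry is a freshly created temporary directory, and the class and method names (EmptyNameDemo, lookup) are made up for illustration. Since the behaviour is an implementation detail, the output may differ on other JVMs.

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.util.Collections;
import java.util.List;

public class EmptyNameDemo {

    // Collects whatever getResources(name) yields for a loader whose only
    // search path entry is the given directory. The parent is null, so only
    // the bootstrap loader is consulted before the directory itself.
    static List<URL> lookup(File dir, String name) throws Exception {
        URL[] searchPath = { dir.toURI().toURL() };
        try (URLClassLoader loader = new URLClassLoader(searchPath, null)) {
            return Collections.list(loader.getResources(name));
        }
    }

    public static void main(String[] args) throws Exception {
        File dir = Files.createTempDirectory("cp-entry").toFile();
        dir.deleteOnExit();

        // An empty child name resolves to the parent directory itself.
        System.out.println("new File(dir, \"\")          -> " + new File(dir, ""));
        System.out.println("getResources(\"\")           -> " + lookup(dir, ""));
        System.out.println("getResources(\".\")          -> " + lookup(dir, "."));
        System.out.println("getResources(\"missing.txt\") -> " + lookup(dir, "missing.txt"));
    }
}
```

On the OpenJDK builds this answer is concerned with, the first three lines should all point at the temporary directory, while the lookup of a non-existent file comes back empty.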

    Why don't these getResources invocations match JAR classpath entries as well? And why are such classpath entries included in the array returned by URLClassLoader.getURLs()?

    API-wise...
    These are two unrelated methods, each serving a distinct purpose. Sometimes they "just happen" to produce the same or similar output, but nowhere do their specifications imply any form of mutual consistency in behaviour.

    getResources, according to URLClassLoader's concrete definition of the term "resource", is specified to return files, directories, or JAR entries beneath its search path. The fact that it also happens to return the search path entries themselves, when they represent directories, is not addressed by its specification, and should hence be treated as an implementation detail (and perhaps a minor specification violation as well) and not be relied upon. Likewise, the fact that it does not return JAR search path entries, while inconsistent with the former, does not counter its specification.

    getURLs, on the other hand, returns the exact[1] search path entries provided at instantiation time.

    Implementation-wise...
    Unlike sun.misc.URLClassPath$FileLoader, which, as seen earlier, resolves the resource name against each search path entry's filesystem path, sun.misc.URLClassPath$JarLoader attempts a direct match via JarFile.getEntry(name), which finds nothing for the entry name "" (and most likely for "." as well). But even if both URLClassPath.Loaders were to interpret the resource name in the same manner, things would not work out as intended, because the embedded JAR filesystem does not support the notion of a root directory.
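The JAR side of the story can be checked in the same way. This is only a sketch under stated assumptions: it writes a throwaway JAR with a single entry (hello.txt, an arbitrary name) and asks both JarFile and a URLClassLoader over that JAR for the names "", "." and "hello.txt". The class and method names (JarRootDemo, createSampleJar, hasEntry, lookup) are hypothetical.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

public class JarRootDemo {

    // Writes a throwaway JAR containing a single entry, "hello.txt".
    static File createSampleJar() throws Exception {
        File jar = File.createTempFile("sample", ".jar");
        jar.deleteOnExit();
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
            out.putNextEntry(new JarEntry("hello.txt"));
            out.write("hello".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        return jar;
    }

    // True if JarFile.getEntry finds an entry for the given name.
    static boolean hasEntry(File jar, String name) throws Exception {
        try (JarFile jarFile = new JarFile(jar)) {
            return jarFile.getEntry(name) != null;
        }
    }

    // Resources reported by a URLClassLoader whose search path is just this JAR.
    static List<URL> lookup(File jar, String name) throws Exception {
        URL[] searchPath = { jar.toURI().toURL() };
        try (URLClassLoader loader = new URLClassLoader(searchPath, null)) {
            return Collections.list(loader.getResources(name));
        }
    }

    public static void main(String[] args) throws Exception {
        File jar = createSampleJar();
        for (String name : new String[] { "", ".", "hello.txt" }) {
            System.out.println("\"" + name + "\": entry found = " + hasEntry(jar, name)
                    + ", getResources = " + lookup(jar, name));
        }
    }
}
```

Only the real entry name is expected to produce a match; the JAR itself is never reported as a resource.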

    So how am I supposed to retrieve all classpath entries?

    To do so independently of the system ClassLoader in effect, use something along the lines of

    String[] classPathEntries = System.getProperty("java.class.path").split(File.pathSeparator);
    

    , preferably early on in your main method, before any third-party code is afforded the opportunity to modify the property.
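A slightly fuller version of the one-liner above might look as follows. It is a sketch only: the class name ClassPathEntries and the parse helper are invented for illustration, and the classification into directories and files is just one way of presenting the result.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class ClassPathEntries {

    // Splits a class path string on the platform's path separator
    // (':' on Unix-like systems, ';' on Windows) and wraps each entry in a File.
    static List<File> parse(String classPath) {
        List<File> entries = new ArrayList<>();
        for (String entry : classPath.split(File.pathSeparator)) {
            entries.add(new File(entry));
        }
        return entries;
    }

    public static void main(String[] args) {
        // Read the property first thing, before other code can change it.
        String classPath = System.getProperty("java.class.path", "");
        for (File entry : parse(classPath)) {
            String kind = entry.isDirectory() ? "directory"
                    : entry.isFile() ? "file (e.g. a JAR)"
                    : "not found";
            System.out.println(kind + ": " + entry);
        }
    }
}
```

Unlike getResources(""), this lists JAR entries as well as directories, because it reads the configured search path rather than asking the loader to resolve a name.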

    ClassLoader.getSystemClassLoader() has a return type of java.lang.ClassLoader. How do we know (for sure) that the returned instance is a sun.misc.Launcher$AppClassLoader?

    We really don't. The system class loader is implementation-dependent and replaceable. All we can do, as always, is test, e.g.,

    try {
        ClassLoader sysCl = ClassLoader.getSystemClassLoader();
        // not using single-arg Class.forName, since it would use the ClassLoader of this class,
        // which, in the worst-case scenario of being a non-delegating loader, could attempt to load AppClassLoader itself
        if (Class.forName("sun.misc.Launcher$AppClassLoader", false, sysCl).isAssignableFrom(sysCl.getClass())) {
            // default implementation, _most likely_ a URLClassLoader subclass
        }
        else {
            // System ClassLoader overridden, or not on OpenJDK
        }
    }
    catch (ReflectiveOperationException roe) {
        // most likely not on OpenJDK
    }
    

    , and act accordingly.
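A less implementation-specific variant of the same test is to check for URLClassLoader directly, without naming any sun.* class. This matters because, from Java 9 onwards, sun.misc.Launcher no longer exists and the default system class loader is not a URLClassLoader. The sketch below is one possible shape for "acting accordingly"; the class name SearchPathProbe and its method are hypothetical, and the fallback to java.class.path only makes sense when the loader passed in is the system class loader.

```java
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

public class SearchPathProbe {

    // Returns the loader's search path as strings. If the loader is a
    // URLClassLoader, its own URLs are used; otherwise the java.class.path
    // property is used as a fallback (an assumption that only holds for the
    // system class loader).
    static List<String> searchPath(ClassLoader loader) {
        List<String> result = new ArrayList<>();
        if (loader instanceof URLClassLoader) {
            for (URL url : ((URLClassLoader) loader).getURLs()) {
                result.add(url.toString());
            }
        } else {
            String classPath = System.getProperty("java.class.path", "");
            for (String entry : classPath.split(File.pathSeparator)) {
                result.add(new File(entry).toURI().toString());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ClassLoader systemLoader = ClassLoader.getSystemClassLoader();
        System.out.println("System loader class: " + systemLoader.getClass().getName());
        for (String entry : searchPath(systemLoader)) {
            System.out.println(entry);
        }
    }
}
```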


    [1] This might not always hold true, e.g., when search path entries "overlap" (one is another's parent), or security constraints apply; refer to the source for sun.misc.URLClassPath for the specifics.