java jar jvm classloader

How does JVM find a class file among a list of JARs?


Based on my understanding, the Java launcher finds a class file by searching through the classpath, which can include a list of JARs (ref https://docs.oracle.com/javase/8/docs/technotes/tools/findingclasses.html). My question is: given a list of JARs and a class to look for, does Java search each JAR one by one, sequentially? If not, how does it find the right class file? Does it cache anything in the process, e.g. the directory structure of the JARs? If the list of JARs is very long, could that cause performance or memory issues?

I did see this post: How does Java efficiently search jar files for classes?, but it had no references on the default behavior or on what optimizations are or can be done.


Solution

  • This is a rough answer, sorry for that. But it might be better than nothing, so here we go.

    When you refer to a class that the JVM has not loaded yet, the JVM does indeed start looking through the class path, iterating over its entries one by one, in order. If you refer to a.b.C, the JVM looks for an entry named a/b/C.class, and the first class path entry that contains it wins.
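
    To make the sequential search concrete, here is a minimal sketch that mimics it by hand. It is not the JDK's actual implementation; the class name ClassPathScanSketch and both helper methods are made up for illustration. It only shows the idea of turning a.b.C into a/b/C.class and probing each class path entry in order.

```java
import java.io.File;
import java.io.IOException;
import java.util.zip.ZipFile;

// Hypothetical illustration only: mimics the "first entry that has it wins" search.
final class ClassPathScanSketch {

    // "a.b.C" -> "a/b/C.class"
    static String toResourcePath(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Walks the class path entries in order and returns the first one that
    // contains the class file, or null if none does.
    static String findFirstEntry(String classPath, String className) {
        String resource = toResourcePath(className);
        for (String entry : classPath.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue;
            }
            File file = new File(entry);
            if (file.isDirectory()) {
                // Directory entry: the class file is a plain file below it.
                if (new File(file, resource).isFile()) {
                    return entry;
                }
            } else if (file.isFile()) {
                // JAR entry: a JAR is a ZIP archive, so look the name up in it.
                try (ZipFile zip = new ZipFile(file)) {
                    if (zip.getEntry(resource) != null) {
                        return entry;
                    }
                } catch (IOException e) {
                    // Not a readable archive; skip it in this sketch.
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String classPath = System.getProperty("java.class.path");
        String className = args.length > 0 ? args[0] : "ClassPathScanSketch";
        System.out.println(findFirstEntry(classPath, className));
    }
}
```

    Running it with a class name as the argument prints the class path entry that would supply that class under this simplified model, or null if none matches.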

    The class path can consist of many different things. It is actually abstracted by a class loader, which can do anything, e.g. download a class file over the Internet or fetch it from a database. But usually it just looks inside JAR files (or inside directories, which are valid class path entries as well).

    JAR files are ZIP files, and a ZIP file has a central directory (an index of its entries, stored at the end of the archive), which allows for relatively quick lookup by name. It's not super performant, so if your class path is huge, you might get some performance boost by combining it into one uberjar. You might also get some boost by replacing the JARs with extracted directories. But don't take this as advice; most likely the boost will be negligible.
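
    As a rough illustration of lookup by name inside a JAR, the sketch below writes a throwaway JAR and then opens it once to look up several classes. The class JarLookupSketch and its methods are hypothetical names. As far as I know, java.util.jar.JarFile reads the archive's index when the file is opened, so each getEntry call is a lookup by name rather than a scan of the whole file, but treat that as my assumption about the implementation rather than a documented guarantee.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical illustration only: one open JarFile, many lookups by entry name.
final class JarLookupSketch {

    // Writes a temporary JAR whose entries are empty placeholders with the given names.
    static Path writeJar(String... entryNames) {
        try {
            Path jar = Files.createTempFile("lookup-sketch", ".jar");
            jar.toFile().deleteOnExit();
            try (OutputStream out = Files.newOutputStream(jar);
                 JarOutputStream jarOut = new JarOutputStream(out)) {
                for (String name : entryNames) {
                    jarOut.putNextEntry(new JarEntry(name));
                    jarOut.closeEntry();
                }
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Opens the JAR once and counts how many of the given classes it contains.
    static int countPresent(Path jar, String... classNames) {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            int found = 0;
            for (String className : classNames) {
                String resource = className.replace('.', '/') + ".class";
                if (jarFile.getEntry(resource) != null) {
                    found++;
                }
            }
            return found;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path jar = writeJar("a/b/C.class", "a/b/D.class");
        System.out.println(countPresent(jar, "a.b.C", "a.b.D", "a.b.Missing"));
    }
}
```

    The point of keeping the JarFile open across lookups is that the cost of opening the archive is paid once, which is roughly why a long class path costs more at the first miss than on later lookups into JARs that are already open.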

    After the ClassLoader has located the .class file, it reads its content and "sends" the bytes to the JVM. It can also alter those bytes before "sending" them. So you can achieve some magic things with such custom class loaders, but ordinary class loaders don't do that.
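
    Here is a minimal sketch of that hand-off, assuming Java 9 or later (for InputStream.readAllBytes). BytesClassLoaderSketch is a made-up name. It uses a null parent so that application classes are not delegated to the regular application class loader, reads the bytes itself, and passes them to the JVM with defineClass. Where the bytes come from (here: the system class path) is just a choice made for this example.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration only: a class loader that fetches bytes itself
// and hands them to the JVM via defineClass.
final class BytesClassLoaderSketch extends ClassLoader {

    BytesClassLoaderSketch() {
        // Null parent: only the bootstrap loader is consulted before findClass.
        super(null);
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        String resource = name.replace('.', '/') + ".class";
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resource)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            byte[] bytes = in.readAllBytes();
            // A "magic" loader could transform the bytes here before defining the class.
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    // Convenience helper for the sketch: true if this kind of loader can load the class.
    static boolean canLoad(String className) {
        try {
            new BytesClassLoaderSketch().loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        Class<?> viaSketch = new BytesClassLoaderSketch().loadClass("BytesClassLoaderSketch");
        System.out.println(viaSketch.getClassLoader());
        System.out.println(viaSketch == BytesClassLoaderSketch.class);
    }
}
```

    If the compiled class file is reachable on the class path, the class defined by this loader is a different Class object from the one the application loader defined, because a class is identified by its name together with its defining loader.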

    After the JVM gets those bytes, it parses them, verifies them, maybe compiles them, and in the end the class is loaded into the JVM's address space in a form that is heavily optimized for execution. From then on, this class is looked up "instantly", without searching the class path again.
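
    A small way to observe the "loaded once" behaviour from inside a program: asking for the same class twice through the same loader gives back the very same Class object. LoadedOnceSketch is a hypothetical name used only for this example.

```java
// Hypothetical illustration only: repeated lookups return the already-loaded class.
final class LoadedOnceSketch {

    // Returns true if two lookups of the same name yield the identical Class object.
    static boolean sameClassObject(String className) {
        try {
            Class<?> first = Class.forName(className);
            Class<?> second = Class.forName(className);
            return first == second;
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("No such class: " + className, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameClassObject("java.util.ArrayList"));
    }
}
```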

    I think the JVM can also unload a class once nothing references it any more (as far as I know, this also requires the class loader that defined it to become unreachable), so I suppose this process could repeat for the same class. But it's not something you should worry about.

    Another thing to keep in mind is that the class path was the only mechanism up to Java 8. It can still be used in modern Java, but since Java 9 there is also the module path, which can be used alongside the class path (or as a replacement). It's my understanding that the two are similar when it comes to low-level details like the ones I described above, but it's worth mentioning, I guess. I think most programs haven't embraced modules yet, but I don't have statistics.
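
    If you are curious which of the two a given class came from, a quick check (Java 9 or later) is to ask the class for its module. Classes loaded from the class path end up in an unnamed module, while JDK classes such as String live in named modules. ModuleCheckSketch and describe are names I made up for this sketch.

```java
// Hypothetical illustration only: tells named modules apart from the class path.
final class ModuleCheckSketch {

    static String describe(Class<?> type) {
        Module module = type.getModule();
        return module.isNamed()
                ? "module " + module.getName()
                : "unnamed module (class path)";
    }

    public static void main(String[] args) {
        System.out.println(describe(String.class));
        System.out.println(describe(ModuleCheckSketch.class));
    }
}
```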