
The compilation and execution of a Java program?


I am a beginner in a Java programming course, and so far this is what I have understood about how a whole Java program is compiled and executed. Stated briefly:

1) The source code (.java file) is converted into bytecode (a .class file, an intermediate representation) by the Java compiler.

2) This bytecode (.class) file is platform independent, so whoosh... I can copy it and take it to a machine on a different platform, as long as that machine has a JVM.

3) When I run the bytecode, the JVM (which is a part of the JRE) first verifies the bytecode and then calls the JIT, which makes optimizations at runtime since it has access to dynamic runtime information.

4) And finally, the JVM interprets the intermediate code into a series of machine instructions for the processor to execute. (A processor can't execute the bytecode directly, since it is not native code.)

Is my understanding correct? Anything that needs to be added or corrected?


Solution

  • Taking each of your points in turn:

    1) This is correct. Java source is compiled by javac (although other tools could do the same thing) and class files are generated.
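
    The compile step above can be sketched with a minimal class (the file name `Hello.java` and the `greeting` method are just for illustration). Compiling it with `javac Hello.java` produces `Hello.class`, and `javap -c Hello` disassembles the generated bytecodes:

    ```java
    // Compile:      javac Hello.java   -> produces Hello.class (bytecode)
    // Disassemble:  javap -c Hello     -> shows the bytecode instructions
    // Run:          java Hello
    public class Hello {
        static String greeting() {
            return "Hello, bytecode!";
        }

        public static void main(String[] args) {
            System.out.println(greeting());
        }
    }
    ```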

    2) Again, correct. Class files contain platform-neutral bytecodes. These are loosely an instruction set for a 'virtual' machine (i.e. the JVM). This is how Java implements the "write once, run anywhere" idea it's had since it was launched.

    3) Partially correct. When the JVM needs to load a class it runs a four-phase verification on the bytecodes of that class to ensure that the format of the bytecodes is legal in terms of the JVM. This is to prevent bytecode sequences being generated that could potentially subvert the JVM (i.e. virus-like behaviour). The JVM does not, however, run the JIT at this point. When bytecodes are executed they start in interpreted mode. Each bytecode is converted on the fly to the required native instructions and OS system calls.

    4) This is sort of wrong when combined with point 3.

    Here's the process explained briefly:

    As the JVM interprets the bytecodes of the application it also profiles which groups of bytecodes are being run frequently. If you have a loop that repeatedly calls a method the JVM will notice this and identify that this is a hotspot in your code (hence the name of the Oracle JVM). Once a method has been called enough times (which is tunable), the JVM will call the Just In Time (JIT) compiler to generate native instructions for that method. When the method is called again the native code is used, eliminating the need for interpreting and thus improving the speed of the application. This profiling phase is what leads to the 'warm-up' behaviour of a Java application where relevant sections of the code are gradually compiled into native instructions.
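
    A rough sketch of how you could watch this happen yourself (assuming a HotSpot-based JVM; the class and method names here are made up for the example). Running it with `java -XX:+PrintCompilation HotLoop` should, after enough iterations, show `HotLoop::square` appearing in the compilation log as the JIT compiles it to native code:

    ```java
    // Run with:  java -XX:+PrintCompilation HotLoop
    public class HotLoop {
        static long square(long n) {
            return n * n;
        }

        public static void main(String[] args) {
            long sum = 0;
            // Hot loop: square() is called enough times to cross the
            // JVM's (tunable) compile threshold, so it becomes a hotspot.
            for (int i = 0; i < 1_000_000; i++) {
                sum += square(i);
            }
            System.out.println(sum);
        }
    }
    ```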

    For OpenJDK-based JVMs there are two JIT compilers, C1 and C2 (sometimes called client and server). The C1 JIT will warm up more quickly but has a lower optimum level of performance. C2 warms up more slowly but applies a greater level of optimisation to the code, giving a higher overall performance level.
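
    One way to see the C1/C2 difference is to run the same workload under different tiered-compilation flags (HotSpot-specific; `TierDemo` and `work` are illustrative names). For example, `java -XX:TieredStopAtLevel=1 TierDemo` stops at C1 (fast warm-up, less optimised code), while a plain `java TierDemo` uses the default pipeline of C1 followed by C2:

    ```java
    //   java -XX:TieredStopAtLevel=1 TierDemo   // C1 only
    //   java TierDemo                           // default: C1 then C2
    public class TierDemo {
        static double work(int n) {
            double acc = 0;
            for (int i = 1; i <= n; i++) {
                acc += Math.sqrt(i);   // enough arithmetic for the JIT to optimise
            }
            return acc;
        }

        public static void main(String[] args) {
            long start = System.nanoTime();
            double result = work(5_000_000);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println("result=" + result + " took " + elapsedMs + "ms");
        }
    }
    ```

    The absolute timings will vary by machine and JVM version; the point is only the relative difference between the two runs.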

    The JVM can also throw away compiled code, either because it hasn't been used for a long time (like in a cache) or because an assumption that the JIT made (called a speculative optimisation) turns out to be wrong. This is called a deopt and results in the JVM going back to interpreted mode, reprofiling the code and potentially recompiling it with the JIT.
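
    A sketch of one common way a speculative optimisation gets invalidated (class and method names are illustrative). During warm-up the call site in `total()` only ever sees one receiver type, so the JIT may speculatively inline that implementation; when a second type shows up, the assumption breaks and the method is deoptimised (with `-XX:+PrintCompilation` on HotSpot you may see "made not entrant" lines at that point):

    ```java
    public class DeoptDemo {
        interface Shape { double area(); }

        static class Square implements Shape {
            public double area() { return 4.0; }
        }

        static class Circle implements Shape {
            public double area() { return Math.PI; }
        }

        // The call site the JIT profiles and may speculatively devirtualise.
        static double total(Shape s) {
            return s.area();
        }

        public static void main(String[] args) {
            double sum = 0;
            Shape square = new Square();
            // Warm-up: only Square is ever seen here, so the JIT may
            // inline Square.area() directly into total().
            for (int i = 0; i < 1_000_000; i++) sum += total(square);

            // A new receiver type invalidates that speculation, forcing a
            // deopt and recompilation of total() with a more general dispatch.
            Shape circle = new Circle();
            for (int i = 0; i < 1_000_000; i++) sum += total(circle);
            System.out.println(sum);
        }
    }
    ```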