Tags: java, c++, assembly, virtual-machine, processor

Why do we use the Java virtual machine?


I am trying to wrap my head around the Java Virtual Machine and why it uses bytecode. I know this has been asked many times, but I still couldn't be sure my understanding was right, so after researching it I decided to explain how I think it works and ask whether that is correct.

I understand that in C++, the compiler compiles the source code for a specific architecture and operating system. So a C++ program compiled for x86 + Windows won't run on any other architecture or operating system.

My assumptions

When the Java compiler compiles the source code into bytecode, it doesn't do the compilation depending on the architecture or operating system. The source code will always be compiled to the same bytecode, whether it's compiled on Windows or Mac. Let's say we compile it and send the bytecode to another computer (x86 + Windows). In order for that computer to run this bytecode, it needs a JVM. Now, the JVM knows what architecture and operating system it's running on (x86 + Windows). So the JVM will compile the bytecode for x86 + Windows and produce machine code that the actual computer can run.

So, even though we use the Java Virtual Machine, we still run the actual machine code on our operating system and not on the virtual machine. The Virtual Machine just helps us to transform bytecode into machine code.

This means that when using Java, the only thing we have to worry about is installing a JVM and that's it.

I always thought that a Virtual Machine is a computer in itself, one that runs the code in its own isolated place. But in the case of the JVM, I don't think that's correct, because I think the machine code the JVM produces still has to be run on the actual operating system we have.

Do you think my assumptions are correct?


Solution

  • When the Java compiler compiles the source code into bytecode, it doesn't do the compilation depending on the architecture or operating system. The source code will always be compiled to the same bytecode, whether it's compiled on Windows or Mac.

    All correct.
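One way to see the platform independence for yourself is to look at the start of a compiled class file. The sketch below assumes only the documented class file layout (a 4-byte magic number `0xCAFEBABE`, then a 2-byte minor and a 2-byte major version, all big-endian). `ClassFileHeader` is a hypothetical helper written for this illustration, not a JDK class.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not part of the JDK.
final class ClassFileHeader {
    final int magic;
    final int minorVersion;
    final int majorVersion;

    private ClassFileHeader(int magic, int minorVersion, int majorVersion) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    // Parses the first 8 bytes of a class file: u4 magic, u2 minor_version, u2 major_version.
    static ClassFileHeader parse(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("too short for a class file header");
        }
        // ByteBuffer is big-endian by default, which matches the class file format.
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        int magic = buf.getInt();
        int minor = buf.getShort() & 0xFFFF;
        int major = buf.getShort() & 0xFFFF;
        return new ClassFileHeader(magic, minor, major);
    }

    public static void main(String[] args) throws IOException {
        // Try to read this class's own compiled bytes through its class loader.
        String resource = ClassFileHeader.class.getName().replace('.', '/') + ".class";
        try (InputStream in = ClassFileHeader.class.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) {
                System.out.println("Class file not available as a resource");
                return;
            }
            ClassFileHeader h = parse(in.readNBytes(8));
            System.out.printf("magic=0x%08X major=%d minor=%d%n",
                    h.magic, h.majorVersion, h.minorVersion);
        }
    }
}
```

If you compile this with the same `javac` version and options on Windows, macOS and Linux, the header it prints should be identical on each, because the class file format does not depend on the platform. The major version reflects the compiler's target release (for example, 52 for Java 8), not the operating system.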

    Let's say we compile it and send the bytecode to another computer (x86 + Windows). In order for that computer to run this bytecode, it needs a JVM. Now, the JVM knows what architecture and operating system it's running on (x86 + Windows). So the JVM will compile the bytecode for x86 + Windows and produce machine code that the actual computer can run.

    This is mostly correct, but there are a couple of things that are a bit "off".

    First of all, to execute the bytecodes on any computer you need a JVM. That includes the computer on which you compiled the bytecodes.

    (It is theoretically possible that a computer could be designed and implemented to execute the JVM bytecode instruction set as its native instruction set. But I don't know if anyone has ever seriously contemplated doing this. It would be pointless. Performance would not be comparable with hardware that you can buy for a couple of hundred dollars. The JVM bytecode instruction set is designed to be compact and simple, and to be relatively easy to JIT compile. It is not designed to be executed efficiently as is.)

    Secondly, a typical JVM actually operates in two modes:

    • It starts out executing the bytecodes in software using an interpreter.
    • After a bit, it selectively compiles bytecodes of heavily used methods to the platform's native instruction set and executes the native code. The compilation is done using a JIT compiler.

    Note that the JIT compiler is platform specific.
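The two modes can be observed from inside a running program. The following is a minimal sketch, assuming a JVM that exposes the standard `java.lang.management` API; `JitProbe` and its method names are made up for this illustration. It prints the platform the JVM reports, then asks the JVM about its JIT compiler before and after calling a small method many times.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical demo class for illustration only.
final class JitProbe {
    // A small method called many times, so the JVM may decide it is worth compiling.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // Describes the JIT compiler as reported by the JVM, if it reports one at all.
    static String describeJit() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            return "no JIT compiler reported";
        }
        String time = jit.isCompilationTimeMonitoringSupported()
                ? jit.getTotalCompilationTime() + " ms"
                : "n/a";
        return jit.getName() + " (total compilation time: " + time + ")";
    }

    public static void main(String[] args) {
        System.out.println("Platform: " + System.getProperty("os.name")
                + " / " + System.getProperty("os.arch"));
        System.out.println("Before: " + describeJit());

        long checksum = 0;
        for (int call = 0; call < 20_000; call++) {
            checksum += sumOfSquares(1_000);
        }

        System.out.println("After:  " + describeJit());
        System.out.println("checksum = " + checksum);
    }
}
```

The exact numbers and the compiler name depend on the JVM and the machine, so treat the output as indicative only. On HotSpot, running with `-Xint` forces interpreter-only execution and `-XX:+PrintCompilation` logs each method as it is compiled; both flags are HotSpot-specific.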

    So, even though we use the Java Virtual Machine, we still run the actual machine code on our operating system and not on the virtual machine.

    That is correct.

    [The Java] Virtual Machine just helps us to transform bytecode into machine code.

    The JVM actually does a lot more. Things like:

    • Garbage collection
    • Bytecode loading and verification
    • Implementing reflection
    • Providing native code methods for bridging between Java classes and operating system functionality
    • Implementing infrastructure for monitoring, profiling, debugging and so on.
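A few of the services listed above are visible through standard APIs. This is a small sketch, assuming the standard `java.lang.management` and `java.lang.reflect` packages are available; `JvmServices` and its method names are invented for this illustration. It asks the running JVM which garbage collectors it uses, how many classes it has loaded, and calls a method via reflection.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for illustration only.
final class JvmServices {
    // Names of the garbage collectors the running JVM reports.
    static List<String> garbageCollectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    // Number of classes currently loaded by the JVM.
    static int loadedClassCount() {
        return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
    }

    // Calls String.length() through reflection instead of a direct call.
    static int lengthViaReflection(String text) {
        try {
            Method length = String.class.getMethod("length");
            return (Integer) length.invoke(text);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Garbage collectors: " + garbageCollectorNames());
        System.out.println("Loaded classes:     " + loadedClassCount());
        System.out.println("\"hello\".length() via reflection: " + lengthViaReflection("hello"));
    }
}
```

The collector names and class counts vary between JVM implementations, versions and command-line options.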

    This means that when using Java, the only thing we have to worry about is installing a JVM and that's it.

    Yes. But in a modern JDK there are other alternatives; e.g. jlink will generate a custom runtime image containing a cut-down JRE that you can ship with your application, so that users don't need to install a JRE separately. And GraalVM supports ahead-of-time (AOT) compilation to a native executable.

    I always thought that a Virtual Machine is a computer in itself, one that runs the code in its own isolated place. But in the case of the JVM, I don't think that's correct, because I think the machine code the JVM produces still has to be run on the actual operating system we have.

    Ah yes.

    The term "virtual machine" has multiple meanings:

    • A Java Virtual Machine "executes" Java bytecode ... in the sense above.
    • A Linux or Windows virtual machine is where the user's application and the guest operating system are running under the control of a "hypervisor" operating system. The applications and guest OS use the native hardware to execute instructions, but they don't have full control of the hardware.
    • And there are potentially other shades of meaning.

    If you conflate JVMs with other kinds of virtual machine, you can get yourself in knots. Don't. They are different enough that conflating the concepts is not going to help you understand.