c, oop, compiler-construction, programming-languages

User-defined data types/operations to CPU instruction set


In any programming environment, whatever data type I choose, the CPU will ultimately perform only arithmetic operations (addition, logical operations).

How does this transition (from user-defined data types/operations to the CPU instruction set) happen, and what role do the compiler, interpreter, assembler, and linker play in this life cycle?

Also, how does OOP handle this mapping, given that in the worst case almost everything is an object (I mean in the Java language)?


Solution

  • The translation from Java source to native code actually happens in two distinct steps: the conversion from source code to bytecode at compile time (that's what javac does), and the conversion from bytecode to native CPU instructions at runtime (that's what java, the JVM launcher, does).

    When the source code is being "compiled", the fields and methods get condensed into entries in a symbol table. You write "System.out.println()", and javac turns it into something like "get the static field referenced by symbol #2004, and invoke the method referred to by symbol #300 on it" (where #2004 might be "System.out" and #300 might be "void java.io.PrintStream.println()"). (Note, I'm way oversimplifying -- the symbols look nothing like that, and they're split up a bit more. But they do contain that kind of info.)
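
    To make the symbol idea a bit more concrete, here is a minimal sketch. The class name Hello and the method greeting() are made up for illustration, and the instructions named in the comments are what javac typically emits for a call like this; the exact constant-pool index numbers vary by compiler and class.

```java
// Hello.java: a hypothetical class used only for illustration.
class Hello {
    // A trivial user-defined method, so there is something to call.
    static String greeting() {
        return "hello";
    }

    public static void main(String[] args) {
        // javac typically compiles the line below into roughly:
        //   getstatic      -> symbolic reference to field java/lang/System.out
        //   invokestatic   -> symbolic reference to method Hello.greeting
        //   invokevirtual  -> symbolic reference to java/io/PrintStream.println
        // Each operand is an index into the class file's constant pool,
        // not a memory address.
        System.out.println(greeting());
    }
}
```

    After compiling with "javac Hello.java", running "javap -v Hello" prints the constant pool and the instructions that refer to it, which is the easiest way to see what the symbols really look like.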

    At runtime, the JVM looks at those symbols, loads the classes referred to in them, and runs (or generates, if it's JITting) the native instructions necessary to find and execute the method. There's no real "linker" in Java; all the linking is done at runtime, based on the classes referenced. It's a lot like how DLLs work in Windows.
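
    The flavour of this name-based, runtime lookup can be imitated from ordinary Java code with reflection. This is only an analogy, not what the JVM does internally: the helper callByName and the class LateBinding are hypothetical names, and the sketch assumes the target class has a public no-argument constructor.

```java
import java.lang.reflect.Method;

class LateBinding {
    // Finds a class and a no-argument method purely by name at runtime,
    // then invokes the method on a fresh instance.
    static Object callByName(String className, String methodName) {
        try {
            Class<?> cls = Class.forName(className);            // load the class by name
            Object instance = cls.getDeclaredConstructor().newInstance();
            Method method = cls.getMethod(methodName);           // resolve the method by name
            return method.invoke(instance);
        } catch (ReflectiveOperationException e) {
            // Roughly analogous to a linkage error: the name could not be resolved.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Calls size() on a newly created, empty ArrayList.
        System.out.println(callByName("java.util.ArrayList", "size")); // prints 0
    }
}
```

    Nothing about ArrayList is fixed when LateBinding is compiled; the names are only resolved when the call actually runs, which is the same general idea as the JVM resolving symbolic references on first use.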

    The JIT compiler is about the closest thing there is to an "assembler". It takes the bytecode and generates equivalent native code on the fly. An assembler normally translates human-readable assembly text, though, and bytecode isn't human-readable, so I wouldn't normally count the translation as "assembling".

    ...

    In languages like C and C++ (not C++/CLI), the story is quite different. All of the translation (and a good bit of linking) happens at compile time. Access to members of a struct gets converted into something like "give me the int 4 bytes from the beginning of this particular bunch of bytes". There's no flexibility there; if the struct's layout changes, generally the whole app has to be recompiled.
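
    The "int 4 bytes from the beginning" idea can be imitated in Java with a ByteBuffer. This is an analogy rather than what a C compiler emits: the two-field layout (an int id followed by an int count) and all names below are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class FixedLayout {
    // Hypothetical layout, mirroring a C struct { int id; int count; }.
    // The offsets are constants baked into the code that reads the record.
    static final int ID_OFFSET = 0;
    static final int COUNT_OFFSET = 4;
    static final int RECORD_SIZE = 8;

    // Packs the two fields into a raw block of bytes.
    static ByteBuffer makeRecord(int id, int count) {
        ByteBuffer buf = ByteBuffer.allocate(RECORD_SIZE).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(ID_OFFSET, id);
        buf.putInt(COUNT_OFFSET, count);
        return buf;
    }

    // "Give me the int 4 bytes from the beginning of this bunch of bytes."
    static int readCount(ByteBuffer record) {
        return record.getInt(COUNT_OFFSET);
    }

    public static void main(String[] args) {
        System.out.println(readCount(makeRecord(7, 42))); // prints 42
    }
}
```

    If a new field were inserted before count, COUNT_OFFSET would be wrong in every piece of code that still used the old value, which is why a changed struct layout generally forces a recompile in C and C++.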