Tags: assembly, compilation, operating-system, system-calls, cpu-architecture

What physically happens inside a computer when a piece of code is compiled and run?


I'm curious to understand how a piece of code is transformed into something that the computer can understand. In many similar questions, a common answer is to state that the chain from human readable code to machine level instructions follows more or less the compilation chain:

high-level code --> assembly code --> machine binary code

Despite how often this answer is given, it remains a bit of a mystery to me what is really going on inside the computer. I understand the basic electronics behind computers and CPUs to some extent, but this particular case is still unclear to me.

Say, for example, that I write the classic "Hello, world!" program in C++:

#include <iostream>
using namespace std;
int main() 
{
    cout << "Hello, world!";
    return 0;
}

which is then transformed into assembly code such as

section     .text
global      _start                              ;must be declared for linker (ld)

_start:                                         ;tell linker entry point

    mov     edx,len                             ;message length
    mov     ecx,msg                             ;message to write
    mov     ebx,1                               ;file descriptor (stdout)
    mov     eax,4                               ;system call number (sys_write)
    int     0x80                                ;call kernel

    mov     eax,1                               ;system call number (sys_exit)
    int     0x80                                ;call kernel

section     .data

msg     db  'Hello, world!',0xa                 ;our dear string
len     equ $ - msg                             ;length of our dear string

which is ultimately transformed into the binary sequence

10100010101001111...................

and then "Hello, world!" appears on the screen.

My question is thus: what is physically happening, at the level of electrical signals, at each stage of this process inside the computer?

I understand that my question might be too broad to answer fully, so I'd be grateful if you could point out the major physical phenomena that happen inside the computer (e.g. between the hard drive, the CPU and the RAM).

Also, please let me know if my question is off-topic, since it is not always clear where the border between the different disciplines of computer engineering lies. If it is, could you suggest another SE site where this question would be better suited? Thank you.


Solution

  • A computer has a CPU, RAM, and devices that can be attached.

    A compiler translates program (source) code into machine code, where the program code & data are encoded as numbers — see also Instruction Set Architecture.

    An operating system loads (see loader) the machine code into a process, and starts the CPU at its beginning.

    The CPU is hardware that interprets numbers as machine code instructions, and these tell it what to do every step of the machine code program, and at every step, what the next machine code instruction shall be.

    Some machine code instructions tell the computer to load or store data in memory, or, to communicate with a device.  There is also an interrupt mechanism that allows devices to get the attention of the CPU.

    This process is very meta, since the CPU does this same interpretation of machine code while compiling, linking, and running the operating system. Mechanisms like context switching allow the CPU to switch jobs/programs and play different roles (operating system, user process A, B, etc.).  Except when idle, the CPU is always executing some program, which is to say some sequence of programmed steps as machine code instructions.

    Transistors implement the CPU and the RAM.  Inside the CPU there are functional units for doing, say, addition, subtraction, conditional branching, etc.  These functional units are composed of large numbers of transistors.  The CPU also has register memories and caches.  They all switch values based on the machine code instruction(s) being executed at the moment, as the only job of the CPU is to execute one machine code instruction after another.

    For example, let's say the machine code program instructs the computer to add two numbers that happen to be in CPU registers and write the answer back to one of them.  The hardware will first fetch the instruction to execute, then decode the numbers in it, extract values from the named registers, feed those to the inputs of an ALU, instruct that ALU to perform an add, then store the ALU's output back into a register — ready for the next machine code instruction.

    We also know that an add can be performed by a large amount of very simple Boolean logic that transistors can implement; see adder.  This same idea of assembling transistors into the functional units a CPU needs to interpret machine code instructions applies to all features of the processor.

    In principle, any digital circuit can be built from a single kind of gate, such as the NAND gate, because such gates can be arranged to implement any Boolean function; each gate is itself made of a few transistors.  Gates are arranged into two broad kinds of circuits: combinational and sequential.  While combinational circuits compute output values solely from the provided inputs, sequential circuits have a feedback loop that enables them to remember things.  Designers alternate combinational and sequential circuits to form the functional units of the CPU.  To oversimplify, registers and memory are made using sequential logic, while ALUs use combinational logic.

    You can also look into RAM and how it, too, is constructed out of transistors.

    Note also that while any given set of machine code instructions executes, some of the circuitry goes unused, so we don't care whether those transistors switch unless we are trying to save power; other transistors are expected simply to hold their state, such as those in registers and RAM that the current instructions do not touch.