
How do machines interpret binary?


I was just thinking, how do machines interpret binary code? All I understand is that your code gets turned into 1s and 0s so the machine can understand it, but how does that happen? Is it just a normal text-to-binary translation?


Solution

  • First, "binary" doesn't mean what you think it means: any data on the computer, including text, is already binary. Only the way we choose to display and handle it differs.
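
    To make that point concrete, here is a minimal Python sketch. The sample string "Hi" and the choice of ASCII encoding are my own assumptions for illustration. It prints the same two bytes once as bit patterns and once as a single integer, so only the interpretation changes, not the data.

```python
# The text "Hi" is stored as two bytes; nothing is "converted" to binary,
# it already is binary. We only choose how to display it.
text = "Hi"
raw = text.encode("ascii")          # the bytes that are actually stored

bits = " ".join(f"{byte:08b}" for byte in raw)
print(bits)                         # 01001000 01101001

# The very same bytes read as one big-endian number instead of as text.
as_int = int.from_bytes(raw, "big")
print(as_int)                       # 18537
```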

    Second, compilation is not a simple transformation to funny characters (if it were, we wouldn't need different compilers for different languages). To actually understand machine code, you need to understand the architecture it targets. There are many computer architectures; the one in your PC is just one of them. It is a very broad subject, and grasping it takes a firm understanding of computer architecture.

    I will show an example of a MIPS instruction. If you are interested in reading further and getting some actual knowledge about the subject, try the links at the end of my post.

    MIPS is a popular introductory subject because its instruction format is one of the more digestible ones. MIPS instructions are 32 bits wide. There are 3 kinds of instructions in MIPS: "R", "I" and "J". We will take a look at the "I" instructions.

    When the processor gets an instruction (32 bits of data) it reads it and decides what to do with it. "I" instructions look like this:

    |------|-----|-----|----------------|
     opcode   rs    rt    immediate
       6      5     5     16               (the numbers show how wide each part is, in bits)
    

    The meaning of these:

    • opcode tells what kind of instruction this is (for example: addition, subtraction, multiplication and many others). All instructions (including "R" and "J" types) start with the 6-bit opcode, and that's how the processor knows which kind it is.
    • rs and rt are registers, a kind of storage in the processor that can hold 32-bit values. MIPS has 32 of these and they are identified by their number. Registers are not the same as memory: they sit inside the CPU itself.
    • immediate is a number. It is called that because the number is "right there" in the instruction, not in a register or memory.

    A concrete example of adding an immediate to a number stored in a register:

    001000 00001 00010 0000000000000011
    

    In this example, I broke the instruction into parts as above. The meaning of the values is the following:

    • opcode: 001000 means addi or "add immediate".
    • rs: 00001 is 1 in decimal, so this part of the instruction tells the processor that we want to use register 1 as rs.
    • rt: 00010 is 2 in decimal, same idea as with rs.
    • immediate: 0000000000000011 is 3 in decimal.
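
    The breakdown above can be reproduced with shifts and masks, which is roughly what the decoding step amounts to. This is only a sketch: the function name decode_i_type is my own, and real hardware does this with wiring rather than code.

```python
def decode_i_type(word):
    """Split a 32-bit I-type word into (opcode, rs, rt, immediate)."""
    opcode = (word >> 26) & 0x3F     # top 6 bits
    rs = (word >> 21) & 0x1F         # next 5 bits
    rt = (word >> 16) & 0x1F         # next 5 bits
    immediate = word & 0xFFFF        # low 16 bits
    return opcode, rs, rt, immediate

# The example instruction from above, written as one 32-bit number.
word = 0b001000_00001_00010_0000000000000011
fields = decode_i_type(word)
print(fields)                        # (8, 1, 2, 3)
```

    Opcode 8 is 001000 in binary, the addi opcode from the example; the two register fields come out as 1 and 2, and the immediate as 3.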

    The addi instruction works like this: it takes the value found in rs and adds the immediate value to it. After that it puts the outcome into rt. So, if register 1 holds 2 beforehand, register 2 will contain 2+3=5 when the instruction is done.
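
    Here is a toy simulation of that single instruction, assuming register 1 starts out holding 2. The register file as a Python list and the names execute_addi and sign_extend_16 are my own inventions for illustration. The sketch treats the 16-bit immediate as signed and keeps register 0 fixed at zero, as MIPS does, but it skips the overflow trap that a real addi can raise and simply wraps to 32 bits.

```python
def sign_extend_16(value):
    """Interpret a 16-bit field as a signed number."""
    return value - 0x10000 if value & 0x8000 else value

def execute_addi(registers, word):
    """Apply one addi instruction to a list of 32 register values."""
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    immediate = sign_extend_16(word & 0xFFFF)
    result = (registers[rs] + immediate) & 0xFFFFFFFF  # wrap to 32 bits (simplification)
    if rt != 0:                      # register 0 always reads as zero in MIPS
        registers[rt] = result
    return registers

registers = [0] * 32
registers[1] = 2                     # assumed starting value of register 1
execute_addi(registers, 0b001000_00001_00010_0000000000000011)
print(registers[2])                  # 5
```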

    In a nutshell, compilers parse your text and generate instructions for the target processor that do the same thing you intended your program to do. As you can see, there is a huge gap between the textual representation of the program that we programmers write and the runnable machine code.
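
    To give a feel for the very last step of that translation, here is a toy assembler for exactly one instruction. It is a hypothetical sketch, not how a real compiler or assembler is built: assemble_addi is a made-up name, and it only accepts lines of the form "addi $rt, $rs, imm" with numeric register names.

```python
import re

ADDI_OPCODE = 0b001000

def assemble_addi(line):
    """Turn text like 'addi $2, $1, 3' into a 32-bit machine word."""
    match = re.fullmatch(r"\s*addi\s+\$(\d+)\s*,\s*\$(\d+)\s*,\s*(-?\d+)\s*", line)
    if match is None:
        raise ValueError(f"not a recognised addi line: {line!r}")
    # Assembly order is destination (rt) first, then source (rs).
    rt, rs, immediate = (int(group) for group in match.groups())
    if not (0 <= rt < 32 and 0 <= rs < 32):
        raise ValueError("register numbers must be between 0 and 31")
    if not (-32768 <= immediate <= 32767):
        raise ValueError("immediate must fit in 16 bits")
    # Pack the fields: opcode (6) | rs (5) | rt (5) | immediate (16).
    return (ADDI_OPCODE << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)

word = assemble_addi("addi $2, $1, 3")
print(f"{word:032b}")                # 00100000001000100000000000000011
```

    The printed bits are the same 32 bits as in the example above, just without the spaces between the fields.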

    A few useful resources on MIPS and computer architecture: