Search code examples
programming-languagesassemblyhardware

When machine code is generated from a program how does it translates to hardware level operations?


Like if say the instruction is something like 100010101 1010101 01010101 011101010101. Now how is this translating to an actual job of deleting something from memory? Memory consists of actual physical transistors the HOLD data. What causes them to lose that data is some external signal?

I want to know how that signal is generated. Like how some binary numbers change the state of a physical transistor. Is there a level beyond machine code that isn't explicitly visible to a programmer? I have heard of microcode that handle code at hardware level, even below assembly language. But still I pretty much don't understand. Thanks!


Solution

  • I recommend reading the Petzold book "Code". It explains these things as best as possible without the physics/electronics knowledge.

    Each bit in the memory, at a functional level, HOLDs either a zero or a one (lets not get into the exceptions, not relevant to the discussion), you cannot delete memory you can set it to zeros or ones or a combination. The arbitrary definition of deleted or erased is just that, a definition, the software that erases memory is simply telling the memory to HOLD the value for erased.

    There are two basic types of ram, static and dynamic. And are as their names imply, so long as you dont remove power the static ram will hold its value until changed. Dynamic memory is more like a rechargeable battery and there is a lot of logic that you dont see with assembler or microcode or any software (usually) that keeps the charged batteries charged and empty ones empty. Think about a bunch of water glasses, each one is a bit. Static memory the glasses hold the water until emptied, no evaporation, nothing. Glasses with water lets say are ones and ones without are zeros (an arbitrary definition). When your software wants to write a byte there is a lot of logic that interprets that instruction and commands the memory to write, in this case there is a little helper that fills up or empties the glasses when commanded or reads the values in the glasses when commanded. In the case of dynamic memory, the glasses have little holes in the bottom and are constantly but slowly letting the water drain out. So glasses that are holding a one have to be filled back up, so the helper logic not only responds to the read and write commands but also walks down the row of glasses periodically and fills back up the ones. Why would you bother with unreliable memory like that? It takes twice (four times?) as many transistors for an sram than a dram. Twice the heat/power, twice the size, twice the price, with the added logic it is still cheaper all the way around to use dram for bulk memory. The bits in your processor that are used say for the registers and other things are sram based, static. Bulk memory, the gigabytes of system memory, are usually dram, dynamic.

    The bulk of the work done in the processor/computer is done by electronics that the instruction set or microcode in the rare case of microcoding (x86 families are/were microcoded but when you look at all processor types, microcontrollers that drive most of the everyday items you touch they are generally not microcoded, so most processors are not microcoded). In the same way that you need some worker to help you turn C into assembler, and assembler into machine code, there is logic to turn that machine code into commands to the various parts of the chip and peripherals outside the chip. download either the llvm or gcc source code to get an idea of the percentages of your program being compiled is compared to the amount of software it takes to do that compiling. You will get an idea of how many transistors are needed to turn your 8 or 16 or 32 bit instruction into some sort of command to some hardware.

    Again I recommend the Petzold book, he does an excellent job of teaching how computers work.

    I also recommend writing an emulator. You have done assembler, so you understand the processor at that level, in the same assembler reference for the processor the machine code is usually defined as well, so you can write a program that reads the bits and bytes of the machine code and actually performs the function. An instruction mov r0,#11 you would have some variable in your emulator program for register 0 and when you see that instruction you put the value 11 in that variable and continue on. I would avoid x86, go with something simpler pic 12, msp430, 6502, hc11, or even the thumb instruction set I used. My code isnt necessarily pretty in anyway, closer to brute force (and still buggy no doubt). If everyone reading this were to take the same instruction set definition and write an emulator you would probably have as many different implementations as there are people writing emulators. Likewise for hardware, what you get depends on the team or individual implementing the design. So not only is there a lot of logic involved in parsing through and executing the machine code, that logic can and does vary from implementation to implementation. One x86 to the next might be similar to refactoring software. Or for various reasons the team may choose a do-over and start from scratch with a different implementation. Realistically it is somewhere in the middle chunks of old logic reused tied to new logic.

    Microcoding is like a hybrid car. Microcode is just another instruction set, machine code, and requires lots of logic to implement/execute. What it buys you in large processors is that the microcode can be modified in the field. Not unlike a compiler in that your C program may be fine but the compiler+computer as a whole may be buggy, by putting a fix in the compiler, which is soft, you dont have to replace the computer, the hardware. If a bug can be fixed in microcode then they will patch it in such a way that the BIOS on boot will reprogram the microcode in the chip and now your programs will run fine. No transistors were created or destroyed nor wires added, just the programmable parts changed. Microcode is essentially an emulator, but an emulator that is a very very good fit for the instruction set. Google transmeta and the work that was going on there when Linus was working there. the microcode was a little more visible on that processor.

    I think the best way to answer your question, barring how do transistors work, is to say either look at the amount of software/source in a compiler that takes a relatively simple programming language and converts it to assembler. Or look at an emulator like qemu and how much software it takes to implement a virtual machine capable of running your program. The amount of hardware in the chips and on the motherboard is on par with this, not counting the transistors in the memories, millions to many millions of transistors are needed to implement what is usually few hundred different instructions or less. If you write a pic12 emulator and get a feel for the task then ponder what a 6502 would take, then a z80, then a 486, then think about what a quad core intel 64 bit might involve. The number of transistors for a processor/chip is often advertised/bragged about so you can also get a feel from that as to how much is there that you cannot see from assembler.