compiler-construction, bootstrapping

How was the first ever language compiled?


It is a chicken-and-egg problem. One way to bootstrap a compiler for language X is to write it in a language Y, but then the compiler for language Y has to be compiled first. If you trace this all the way back to the time when no compiler existed, how did the first ever compiler get compiled? Please use high-level metaphors to aid understanding.


Solution

  • Let's take it as read that a compiler for any high-level language that isn't C can be written in C [1].

    So we may as well ask: If I have a computer but no compiler, and I want a C compiler, how can I make one?

    You can write a C compiler in the assembly language of your computer, assuming that you have got an assembler for that computer.

    In practice, that would be foolishly hard. More wisely, you would write, in assembly code, a compiler for an intermediate language somewhat more expressive and powerful than assembly, and then use that language to write a compiler for a somewhat more expressive and powerful one... until you had written a C compiler.

    Each of your progressively more powerful compilers is a program that translates its source language (which you, as the inventor, have defined) into the assembly language of your computer, and then invokes the assembler (which you already have) to translate the assembly code into the machine code of your computer [2].
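
    As a minimal sketch of one such translation step (written in Python rather than assembly, purely for readability), here is a tiny compiler from arithmetic expressions into the assembly of a made-up stack machine. The mnemonics PUSH, ADD, SUB, MUL and HALT and the function names are invented for illustration, there is no error handling, and a real stage would go on to hand the resulting text to the assembler.

```python
import re

def tokenize(src):
    # Split the source into integer literals and single-character operators.
    return re.findall(r"\d+|[-+*()]", src)

def compile_expr(src):
    """Translate an arithmetic expression into toy stack-machine assembly."""
    tokens = tokenize(src)
    pos = 0
    out = []  # assembly lines, in the order they are emitted

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        # factor := NUMBER | "(" expr ")"
        tok = advance()
        if tok == "(":
            expr()
            advance()  # consume the closing ")"
        else:
            out.append(f"PUSH {int(tok)}")

    def term():
        # term := factor ("*" factor)*
        factor()
        while peek() == "*":
            advance()
            factor()
            out.append("MUL")

    def expr():
        # expr := term (("+" | "-") term)*
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            out.append("ADD" if op == "+" else "SUB")

    expr()
    out.append("HALT")
    return out

print("\n".join(compile_expr("1 + 2 * (3 + 4)")))
```

    The point is only that the compiler's output is ordinary assembly text: everything below that level is still the assembler's job.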

    And what if you haven't even got an assembler?

    Then you have to write an assembler in the machine code of your computer. Writing a complex program from scratch in machine code is probably something that nobody still living has the chops to do. But before anybody had written an assembler - in machine code - all programs had to be composed in machine code. The first assemblers were developed in the late 1940s. As with your compiler, you would be wise to develop your assembler iteratively: first a rudimentary one, written in machine code; next a more powerful one, written with the rudimentary one...
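
    To make "a rudimentary one" concrete, here is a sketch of about the simplest assembler imaginable: a table lookup from mnemonics to opcode bytes. It is written in Python only so that it is easy to read. The first real one would have been hand-written machine code, and the opcode table below belongs to a hypothetical stack machine, not to any real CPU.

```python
# Hypothetical opcode table for a made-up stack machine (not a real CPU).
OPCODES = {"PUSH": 0x01, "ADD": 0x02, "SUB": 0x03, "MUL": 0x04, "HALT": 0xFF}

def assemble(lines):
    """Translate toy assembly text into a bytes object of toy machine code."""
    code = bytearray()
    for line in lines:
        line = line.split(";")[0].strip()  # drop comments and whitespace
        if not line:
            continue  # skip blank and comment-only lines
        mnemonic, *operands = line.split()
        code.append(OPCODES[mnemonic])
        if mnemonic == "PUSH":
            # PUSH takes one operand, encoded as a single unsigned byte.
            code.append(int(operands[0]) & 0xFF)
    return bytes(code)

program = ["PUSH 2", "PUSH 3", "MUL   ; compute 2 * 3", "HALT"]
print(assemble(program).hex(" "))
```

    A more powerful successor might add labels, jumps and symbolic constants, and could itself be written in this rudimentary assembly language instead of raw bytes.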

    The machine code of a computer is the native language of the CPU, so no further translation is needed to turn it into executable code. You just have to somehow load the bytes that compose the machine code program into a region of memory and get the processor to fetch the instruction at the initial address: then the computer is running the program.
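
    The load-and-run step can be sketched as a toy simulation. The "machine" here is hypothetical: a 256-byte memory, a program counter, and a handful of invented stack-machine opcodes (0x01 PUSH, 0x02 ADD, 0x03 SUB, 0x04 MUL, 0xFF HALT). A real CPU does the same fetch-decode-execute cycle in hardware.

```python
def run(program, load_address=0x10):
    """Load toy machine code into memory and execute it from load_address."""
    memory = bytearray(256)
    # The "loader": copy the program bytes into a region of memory.
    memory[load_address:load_address + len(program)] = program
    pc = load_address  # program counter starts at the initial address
    stack = []
    while True:
        opcode = memory[pc]  # fetch the next instruction byte
        pc += 1
        if opcode == 0x01:  # PUSH n: the operand is the following byte
            stack.append(memory[pc])
            pc += 1
        elif opcode in (0x02, 0x03, 0x04):  # ADD, SUB, MUL
            b = stack.pop()
            a = stack.pop()
            if opcode == 0x02:
                stack.append(a + b)
            elif opcode == 0x03:
                stack.append(a - b)
            else:
                stack.append(a * b)
        elif opcode == 0xFF:  # HALT: stop and expose the final stack
            return stack
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at address {pc - 1:#04x}")

# PUSH 2, PUSH 3, MUL, HALT, written directly as bytes.
print(run(bytes([0x01, 2, 0x01, 3, 0x04, 0xFF])))
```

    Nothing in `run` knows or cares how the bytes were produced: typed in by hand, emitted by an assembler, or generated by a compiler, they execute the same way.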

    The first C compilers were created pretty much as sketched above. "The Development of the C Language" is the history as recorded by the inventor of the language, Dennis Ritchie.


    [1] Historically, of course, several major high-level language compilers predated C (~1969-73).

    [2] For modern compilers, this is a large simplification. Read about Intermediate Representation and see, e.g., "The Conceptual Structure of GCC".