
Starting off a simple (the simplest perhaps) C compiler?


I came across this: Writing a compiler using Turbo Pascal

I am curious if there are any tutorials or references explaining how to go about creating a simple C compiler. I mean, it is enough if it gets me to the level of making it understand arithmetic operations. I became really curious after reading this article by Ken Thompson. The idea of writing something that understands itself seems exciting.

Why did I put up this question instead of asking Google? I tried Google, and the Pascal one was the first link. The rest did not seem relevant. On top of that, I am not a CS major (so I still need to learn what tools like yacc do), I want to learn this by doing, and I am hoping that people with more experience are better at pointing the way than Google. I want to read an article written in the same spirit as the one I listed above, but one that covers at least the bootstrapping phases of building a simple C compiler.

Also, I don't know the best way to learn. Do I start off building a C compiler in C or in some other language? Do I write a compiler for C or for some other language? I feel questions like these are better answered once I have some direction to explore. Any suggestions?


Solution

  • At its simplest, a compiler consists of three pieces:

    1. A parser
    2. An abstract syntax tree (AST)
    3. An assembly code generator

    There are lots of nice parser generators that take a language grammar as input. Maybe ANTLR would be a good place for you to start. If you want to stick to C roots, try lex/yacc or their GNU counterparts, flex/bison.

    There are grammars for C, but I think C in its entirety is complex. You'd do well to start off with a subset of the language and work your way up.
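    To give a feel for how small such a subset can be, here is a minimal sketch of a hand-written recursive-descent parser for integer arithmetic only (+, -, *, / and parentheses). It is written by hand rather than with one of the generators mentioned above, and in Python purely for brevity. The names tokenize and Parser, and the tuple-based AST shape, are made up for this illustration and do not come from any tool.

```python
import re

# Hypothetical token pattern: integers, the four operators, and parentheses.
TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(src):
    """Split source text into a list of token strings."""
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    """Recursive-descent parser; the AST is built from nested tuples."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse(self):
        node = self.expr()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return node

    # expr := term (('+' | '-') term)*
    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.next()
            node = (op, node, self.term())
        return node

    # term := factor (('*' | '/') factor)*
    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.next()
            node = (op, node, self.factor())
        return node

    # factor := INTEGER | '(' expr ')'
    def factor(self):
        tok = self.next()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok.isdigit():
            return ("num", int(tok))
        if tok == "(":
            node = self.expr()
            if self.next() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {tok!r}")
```

    Calling Parser(tokenize("1 + 2 * 3")).parse() builds a tree in which the multiplication sits below the addition, because term binds tighter than expr. Growing the subset then means adding grammar rules (variables, statements, functions) one at a time.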

    Once you have an AST, you walk it to generate the assembly, and ultimately the machine code, that you'll run.
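    As a rough illustration of that last step, the sketch below walks an arithmetic AST and emits stack-machine style assembly text. Several things here are assumptions for the example: the AST is nested tuples such as ("+", left, right) and ("num", 42), the target is x86-64 in GNU assembler Intel syntax, and the names OPS, gen and compile_expr are invented. Treat it as a starting point, not as how a production code generator works.

```python
# Assumed AST shape: ("num", value) for literals, (op, left, right) otherwise.
# Instructions that combine the left operand (rax) with the right one (rdi).
OPS = {
    "+": ["    add rax, rdi"],
    "-": ["    sub rax, rdi"],
    "*": ["    imul rax, rdi"],
    "/": ["    cqo", "    idiv rdi"],  # sign-extend rax into rdx:rax, then divide
}


def gen(node, out):
    """Append code to out that leaves the value of node on top of the stack."""
    if node[0] == "num":
        out.append(f"    push {node[1]}")
        return
    op, left, right = node
    gen(left, out)
    gen(right, out)
    out.append("    pop rdi")  # right operand
    out.append("    pop rax")  # left operand
    out.extend(OPS[op])
    out.append("    push rax")


def compile_expr(ast):
    """Return assembly text for a main function that returns the expression value."""
    out = [".intel_syntax noprefix", ".globl main", "main:"]
    gen(ast, out)
    out.append("    pop rax")  # the final result becomes the return value
    out.append("    ret")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # 1 + 2 * 3
    print(compile_expr(("+", ("num", 1), ("*", ("num", 2), ("num", 3)))), end="")
```

    Pushing every intermediate result onto the stack is wasteful, but it keeps the generator to a single recursive function and avoids register allocation entirely, which is a reasonable trade-off for a first compiler.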

    It's doable, but not trivial.

    I'd also check Amazon for books about writing compilers. The Dragon Book is the classic, but there are more modern ones available.

    UPDATE: There have been similar questions on Stack Overflow, like this one. Check out those resources as well.