Why is the 'auto' keyword useful for compiler writers in C?


I'm currently reading "Expert C Programming - Deep C Secrets", and just came across this:

The storage class specifier auto is never needed. It is mostly meaningful to a compiler-writer making an entry in a symbol table — it says "this storage is automatically allocated on entering the block" (as opposed to statically allocated at compiletime, or dynamically allocated on the heap). auto is pretty much meaningless to all other programmers, since it can only be used inside a function, but data declarations in a function have this attribute by default.

I saw that someone asked about the same thing here, but that question has no answers, and the link given in the comments only explains why there's such a keyword in C (inherited from B) and how it differs from auto in C++11 and in pre-C++11.

I'm posting anyway to focus on the part stating that the auto keyword is somehow useful in compiler writing. What is the idea, and what is the connection with a symbol table?

I want to stress that I am asking only about a potential use when writing a compiler in C (not when writing a C compiler).

To clarify, I asked this question because I'd like to know if there's an example of code where auto can be justified, because the author stated there would be, when writing compilers.

The whole point is that I think I have understood auto (inherited from B, where it was mandatory, but useless in C), yet I can't imagine any example where using it is useful (or at least not useless).

It really seems that there isn't any reason at all to use auto, but is there any old source code or something like that corresponding to the quoted statements?


Solution

  • Author answer: I just emailed Mr Van der Linden, and here is what he said:

    Yes, I agree with the people who answered on stack overflow. I don't know for certain, because I never used the language B, but it seems highly plausible to me that "auto" ended up in C because it was in B.

    Even when I was professionally kernel and compiler programming in C in the 1980's, I never saw any code that I can recall that used "auto".

    The key takeaway is that the auto keyword doesn't add any extra information, and thus is redundant and unneeded. It was a mistake to bring it into C!

    I also asked for some explanation about what he meant by speaking about compiler writing and symbol table. Here is his response:

    Say you are writing a compiler that will translate C source code into linker objects (object files that can be linked).

    Whenever your lexer (front end of the compiler) finds a sequence of characters that form a user-defined symbol (might be a variable, might be a function name, might be a constant, etc), the compiler will store that name in a table called the "symbol table". It will also store everything else it knows about the symbol - if it is a variable, it will store its type, if a constant it will store the value, if a function it will note that it can be invoked, etc etc. It will also store the scope of the name (the lines of code in which this symbol is known). The symbol table is one of the core data structures of a compiler, and some of it is carried forward into the object file. The object file needs to know any names that are to be addressable by external code objects, so the linker can associate the use of a name with the object in which it is stored.

    Then later, when the compiler comes across the same name, the compiler looks in the symbol table to see if it knows all about the name already. One of the useful items to store about a name is "where the compiler will allocate storage for it". That storage has to be maintained as long as the symbol remains in scope. So it is useful for the symbol table to know where it should allocate the storage at runtime. I gave 3 examples of different places where a variable might be stored. The "auto" keyword tells the compiler "this is a variable, and you should store this on the stack and its scope is the function it is declared in".

    Only, the compiler doesn't need to be told this, because this is already true for all variables declared within a function. I hope this explanation makes sense.
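
    To make that description more concrete, here is a minimal sketch in C of a symbol table whose entries record where storage is allocated. Every name in it (struct symbol, symtab_add, symtab_lookup and so on) is hypothetical and invented for illustration; it is not code from the book or from any real compiler, and it deliberately ignores extern, register and nested scopes.

    ```c
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical names throughout: a sketch, not code from a real compiler. */

    enum scope_kind { SCOPE_FILE, SCOPE_BLOCK };

    /* Where the storage for a name lives. */
    enum storage_kind {
        STORAGE_STATIC,    /* reserved once, for the whole program run */
        STORAGE_AUTOMATIC  /* allocated on entering the block (what `auto` says) */
    };

    struct symbol {
        const char *name;
        const char *type;            /* simplified: the type kept as text */
        enum scope_kind scope;
        enum storage_kind storage;
    };

    #define MAX_SYMBOLS 64

    struct symbol_table {
        struct symbol entries[MAX_SYMBOLS];
        size_t count;
    };

    /* Record a variable. The storage is derived from the scope and from whether
       `static` was written; an explicit `auto` would add nothing. */
    struct symbol *symtab_add(struct symbol_table *t, const char *name,
                              const char *type, enum scope_kind scope,
                              int has_static)
    {
        if (t->count == MAX_SYMBOLS)
            return NULL;             /* table full */
        struct symbol *s = &t->entries[t->count++];
        s->name = name;
        s->type = type;
        s->scope = scope;
        s->storage = (scope == SCOPE_BLOCK && !has_static) ? STORAGE_AUTOMATIC
                                                           : STORAGE_STATIC;
        return s;
    }

    /* Later uses of the name look it up; NULL means "not declared". */
    struct symbol *symtab_lookup(struct symbol_table *t, const char *name)
    {
        for (size_t i = 0; i < t->count; i++)
            if (strcmp(t->entries[i].name, name) == 0)
                return &t->entries[i];
        return NULL;
    }
    ```

    In this sketch, a call such as symtab_add(&t, "counter", "int", SCOPE_BLOCK, 0) already produces an entry marked STORAGE_AUTOMATIC, which is the point of the reply: the table gets the information from the position of the declaration, whether or not auto was written.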

    I guess I completely misunderstood his statements: I thought auto might have some use when writing a compiler in C, in the code dealing with the symbol table, but it seems he meant that auto is useless, yet C compiler writers must still handle and understand it. I nevertheless asked him to confirm my mistake, and it was indeed a misunderstanding on my part:

    Perhaps the best way to think about this is:

    1. "auto" has no semantic effect in C.
    2. We think it came over from B, but don't know for sure.
    3. It conveys info to someone writing a compiler for C code.
    4. But that info is a duplicate of other info that the compiler writer has.
    5. So a compiler writer can take note of either piece of info to update the symbol table.
    6. Or indeed, they can check that the two pieces of info are consistent, and if not, issue an error message.
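
    Points 5 and 6 can be sketched as a small C function that combines the two pieces of info: where the declaration appears, and which storage-class specifier (if any) was written. This is only an illustration under my own assumptions; the enum and function names are hypothetical, only auto and static are modelled, and the error text is made up. The inconsistent case it reports is auto on a file-scope declaration, which C compilers for the dialects discussed here reject.

    ```c
    #include <stdio.h>

    /* Hypothetical names: a sketch of the consistency check, not a real parser. */

    enum decl_scope { DECL_FILE_SCOPE, DECL_BLOCK_SCOPE };
    enum storage_spec { SPEC_NONE, SPEC_AUTO, SPEC_STATIC };
    enum storage_duration { DURATION_STATIC, DURATION_AUTOMATIC, DURATION_ERROR };

    enum storage_duration resolve_storage(enum decl_scope scope,
                                          enum storage_spec spec)
    {
        /* First piece of info: the position of the declaration. */
        enum storage_duration from_scope =
            (scope == DECL_BLOCK_SCOPE) ? DURATION_AUTOMATIC : DURATION_STATIC;

        switch (spec) {
        case SPEC_NONE:
            return from_scope;           /* nothing written: use the default */
        case SPEC_STATIC:
            return DURATION_STATIC;      /* `static` really changes the answer */
        case SPEC_AUTO:
            /* Second piece of info: it must agree with the first. */
            if (from_scope != DURATION_AUTOMATIC) {
                fprintf(stderr,
                        "error: 'auto' used on a file-scope declaration\n");
                return DURATION_ERROR;
            }
            return DURATION_AUTOMATIC;   /* same result as writing nothing */
        }
        return DURATION_ERROR;           /* unreachable for valid enum values */
    }
    ```

    Here resolve_storage(DECL_BLOCK_SCOPE, SPEC_AUTO) and resolve_storage(DECL_BLOCK_SCOPE, SPEC_NONE) give the same result, which is exactly why the keyword is redundant; the only thing auto can add is the error in the file-scope case.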