Tags: c, parsing, initialization, language-lawyer, assignment-operator

How are C declarations actually parsed, based on this interesting discrepancy?


It's fairly common to see a declaration and assignment combined like this:

int beans = a * 2;

or separated, like this:

int beans;
beans = a * 2;

My current understanding is that beans can be assigned to because it is an lvalue; it has storage that can be written to. An rvalue, like a * 2, cannot be assigned to since it's just an expression with a value and no storage. For this reason, this is allowed:

int beans;
(beans) = a * 2;

And in fact, any left operand to the assignment which is an lvalue should work. Now, this seems to suggest that int beans; is an expression which is also an lvalue. However, this is not allowed:

(int beans) = a * 2;

In terms of how the C parser is designed, this suggests that declarations are something greater than just an expression with an lvalue. What is going on here?


Solution

  • The statement

    beans = a * 2;
    

    contains many expressions.

    The main expression is the assignment itself: beans = a * 2. That in turn contains two sub-expressions: beans and a * 2. The multiplication expression has sub-expressions of its own: a and 2.

    All expressions can be parenthesized, which means the whole statement could look like this instead:

    (beans) = ((a) * (2));
    

    Here all the sub-expressions are parenthesized.
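
    To check that the extra parentheses change nothing, here is a minimal sketch. The helper names assign_plain and assign_parenthesized are made up for this illustration and do not come from the question.

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* Hypothetical helper: the assignment as it is normally written. */
    static int assign_plain(int a)
    {
        int beans;
        beans = a * 2;
        return beans;
    }

    /* Hypothetical helper: the same assignment with every
     * sub-expression wrapped in parentheses. */
    static int assign_parenthesized(int a)
    {
        int beans;
        (beans) = ((a) * (2));
        return beans;
    }

    int main(void)
    {
        /* Both forms are the same expression, so the results agree. */
        assert(assign_plain(21) == assign_parenthesized(21));
        printf("%d %d\n", assign_plain(21), assign_parenthesized(21));
        return 0;
    }
    ```

    Parenthesizing an lvalue such as beans yields the same lvalue, which is why (beans) is still a valid left operand.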

    Now we come to the definition:

    int beans;
    

    That is not an expression. It doesn't contain any sub-expressions. It cannot be parenthesized as a whole.

    The definition with initialization, on the other hand:

    int beans = a * 2;
    

    does contain an expression, with sub-expressions. That expression is on the right side of the =, so we can write it as:

    int beans = ((a) * (2));
    

    But again, the variable declaration part is not an expression, and can't be parenthesized.


    Also please note that = in a definition is not assignment. It's initialization. There's a semantic difference indicated by the two different terms.
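
    One way to see that difference in practice: some objects can be initialized but never assigned to. The following is a small sketch; the function name sum_initialized_array and the variable names are invented for this example.

    ```c
    #include <assert.h>
    #include <stdio.h>

    /* Hypothetical helper showing objects that accept an initializer
     * but would reject a later assignment. */
    static int sum_initialized_array(void)
    {
        /* Initialization: the = here is part of the declaration syntax. */
        const int limit = 3;
        int values[3] = {1, 2, 3};

        /* Neither of these would compile as an assignment:
         *   limit = 4;            limit is const-qualified
         *   values = {4, 5, 6};   arrays cannot be assigned to
         */
        int sum = 0;
        for (int i = 0; i < limit; i++) {
            sum += values[i];
        }
        return sum;
    }

    int main(void)
    {
        assert(sum_initialized_array() == 6);
        printf("%d\n", sum_initialized_array());
        return 0;
    }
    ```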


    As mentioned in a comment by Jonathan Leffler, the declarator part of a declaration can be parenthesized.

    For a simple declaration like

    int beans;
    

    parenthesizing the declarator serves no purpose. But for things like pointers to functions or pointers to arrays it makes a large difference.

    Example:

    int *foo(int x);
    

    That declares a function that takes one int argument and returns a pointer to int. Compare to the following:

    int (*foo)(int x);
    

    That declares a variable, which is a pointer to a function. The function takes one int argument, and returns an int value.
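
    Putting the pieces together, here is a sketch that uses both kinds of declaration and also shows the parenthesized declarator from the original question. All names in it (storage, store, twice, demo, op) are hypothetical and chosen only for illustration.

    ```c
    #include <assert.h>
    #include <stdio.h>

    static int storage;

    /* A function that takes one int argument and returns a pointer to int. */
    static int *store(int x)
    {
        storage = x;
        return &storage;
    }

    /* A function that takes one int argument and returns an int. */
    static int twice(int x)
    {
        return x * 2;
    }

    static int demo(int a)
    {
        /* Legal: only the declarator is parenthesized, so this means
         * the same as: int beans = a; */
        int (beans) = a;

        /* Would not compile: a declaration is not an expression.
         *   (int beans) = a;
         */

        /* A variable that is a pointer to a function taking int, returning int. */
        int (*op)(int x) = twice;

        int *p = store(op(beans));
        return *p;
    }

    int main(void)
    {
        assert(demo(21) == 42);
        printf("%d\n", demo(21));
        return 0;
    }
    ```

    Without the parentheses around *op, the line would instead declare a function named op that returns int *, and giving it an initializer would be an error.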