I am trying to understand the grammar behind the unary operators in the C language. According to this version of the standard, Section 6.5.3 (Unary operators) gives the following syntax:
unary-expression:
...
++ unary-expression
...
This means that something like this is legal from a grammar point of view: a = ++++b. However, the gcc compiler gives this error: lvalue required as increment operand.
I don't really understand why?
According to the standard, ++b is equivalent to (b += 1). This means that a=++++b should be expanded to a=((b+=1)+=1). Why does the compiler give the above error?
Explaining this fully takes us pretty deep into language-lawyer land. I'll try to explain it step by step.
When the pre-processor tokenizes the expression ++++b, it looks for the longest sequence of characters that will form a valid operator, the so-called "maximal munch rule". In this case that means that the code will get treated as if we had written ++ ++b (rather than ++ + +b or some such).
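The same rule can be observed in code that does compile. The following is a minimal sketch, not part of the standard text: the function name munch_demo and its out-parameters are made up for illustration. It shows that x+++y is tokenized as x ++ + y, that is (x++) + y, and never as x + (++y).

```c
#include <assert.h>

/* Hypothetical demo: evaluates x+++y and reports x and y afterwards. */
int munch_demo(int *x_after, int *y_after)
{
    int x = 1, y = 10;

    /* Maximal munch: the tokens are x, ++, +, y, so this is (x++) + y. */
    int sum = x+++y;

    assert(sum == 11); /* old value of x (1) plus y (10) */
    assert(x == 2);    /* x was post-incremented */
    assert(y == 10);   /* y was never touched */

    *x_after = x;
    *y_after = y;
    return sum;
}
```

If the tokenizer had instead produced x + ++y, y would end up as 11 and x would stay 1, which is not what happens.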
As you noticed from the formal syntax, the prefix ++ and -- can be combined with other unary expressions and so we get "right to left associativity", meaning ++(++b). However, the C grammar only specifies that it is syntactically fine to combine multiple unary operators like this; whether the result is valid code depends on the individual operators.
From there we end up pondering something that formal C calls "lvalues": essentially expressions that designate an object (a memory location), as opposed to the temporary results of expressions. For example, in the expression a + b with two variables, both a and b individually are objects and lvalues, but the result of the addition is not.
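A small sketch of that distinction, with the made-up function name lvalue_demo: taking the address of a works because a is an lvalue, while the commented-out line would be rejected because a + b is only a value.

```c
#include <assert.h>

/* Hypothetical demo: lvalues designate objects, a + b does not. */
int lvalue_demo(void)
{
    int a = 1, b = 2;

    int *p = &a;  /* fine: a is an lvalue, so its address can be taken */
    /* int *q = &(a + b);    rejected: a + b is not an lvalue */

    *p = a + b;   /* the value of a + b can still be stored into an lvalue */
    assert(*p == 3);
    return a;     /* a was modified through p */
}
```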
Let's look at a valid expression first. Had you written ++*ptr then that would be valid C, because the two unary operators in this case specify the following:
C17 6.5.3.1, emphasis mine
Constraints
The operand of the prefix increment or decrement operator shall have atomic, qualified, or unqualified real or pointer type, and shall be a modifiable lvalue.
C17 6.5.3.2, emphasis mine
The unary * operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object.
So the prefix ++ expects a modifiable lvalue and the prefix * delivers that, so ++*ptr is fine.
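As a quick illustration of the valid case, here is a minimal sketch; the helper name bump is an assumption made for this example and does not come from the standard.

```c
#include <assert.h>

/* Hypothetical helper: increments the int that ptr points to
   and returns the new value. */
int bump(int *ptr)
{
    assert(ptr != 0); /* precondition: ptr must point to an int object */

    /* *ptr is a modifiable lvalue, so the constraint on ++ is satisfied. */
    return ++*ptr;
}
```

Calling bump(&v) with v equal to 5 leaves v at 6 and returns 6.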
However, in the case of ++++b, the result of the first ++b is not an lvalue. So by adding another ++ on top of the first, we violate the above quoted constraint, meaning that the code is invalid C.
So why isn't the result of ++b an lvalue? 6.5.3.1 says the following:
The value of the operand of the prefix ++ operator is incremented. The result is the new value of the operand after incrementation. The expression ++E is equivalent to (E+=1).
OK, so we have to go dig up the rules of the += (compound) assignment operators...
We find the relevant explanation in C17 6.5.16, emphasis mine:
An assignment operator stores a value in the object designated by the left operand. An assignment expression has the value of the left operand after the assignment, but is not an lvalue.
The compiler must therefore present a diagnostic message due to the constraint violation. We didn't violate the syntax rules, but we violated the constraints. Both syntax and constraints are kind of "no exceptions allowed" if you wish to call the code a strictly conforming C program.
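If the intent behind a = ++++b was simply to increment b twice and use the new value, one way to write it without violating the constraint is sketched below. The helper name increment_twice is hypothetical and only serves this example.

```c
#include <assert.h>

/* Hypothetical helper: the effect the asker wanted from a = ++++b. */
int increment_twice(int *b)
{
    assert(b != 0); /* precondition: b must point to an int object */

    /* return ++++*b;    rejected: the result of the inner ++*b
                         is a value, not an lvalue */
    *b += 2;        /* a single modification of the object */
    return *b;      /* the new value, as ++ would have yielded */
}
```

Used as a = increment_twice(&b) with b starting at 3, both a and b end up as 5.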