I have just started self-studying the Dragon book of Compiler Design. I am working on a problem that says to design grammar for an expression containing binary +,-,*,/ and unary +,-
I came up with following
E -> E+T | E-T | T
T -> T*P | T/P | P
P -> +S | -S | S
S -> id | constant | (E)
However, there is an obvious flaw in it. According to this grammar, expressions like
1--3
are valid, which is an error in all programming languages I know. Though, expressions like
1+-+3
and
1- -3
must be valid. How can such a grammar be designed?
I believe your problem is with tokenization. You are identifying 1--3
as an error because you think it should be resolved as 1 --3
rather than 1 - -3
, the latter being perfectly valid. So I think you problem comes because when you tokenize the string you are getting:
['1', '-', '-' , '3']
rather than:
['1', '--', '3']