I am struggling with a grammar that involves typed expressions as well as variable access. The result type of this access is not ascertainable during parsing and is evaluated in a second step. This evaluation is not a problem, but it seems hard to write unambiguous parser rules.
All operations that work on different types (e.g. compare operators) produce a reduce/reduce
conflict. Clearly, this is because the parser can not decide if "x.a = y.b
" should be parsed as "bool_expr EUQAL bool_expr
" or as "num_expr EUQAL num_expr
" because the type is uncertain. However, the result type of the comp_op
rule is certain (as it is always a boolean value).
Is there any solution to this problem without throwing all type information away during parsing and always check it in the evaluation phase?
Here is a shortened grammar example (using ocamllex and ocamlyacc):
comp_op:
| bool_expr EQUAL bool_expr { T.Equiv (T.Wrapper $1, T.Wrapper $3) }
| num_expr EQUAL num_expr { T.Equiv (T.Wrapper $1, T.Wrapper $3) }
/* the wrapper type basically just wraps the expressions to get a common type */
bool_expr:
| TRUE { T.Bool true }
| FALSE { T.Bool false }
| access { T.BoolAccess $1 }
num_expr:
| NUMBER { T.Num $1 }
| access { T.NumAccess $1 }
access:
/* some more complex rules describing the access to variables */
Thanks for your help.
As ygrek says, you should not try to mix parsing and typing. It's much easier to write your parser with only one syntactic category for expressions, and then to have a separate type-checking pass that will sort that out.
Theoretically, this comes from the fact that the distinctions made by typing rules are much finer than what traditional parsing technologies can express. They have been attempt at specifying typing rules more declaratively using, for example, attribute grammars, but your usual LL/LR technology is certainly not a good fit, it's like parsing nested parentheses with a regular expression.
Finally, you should use menhir instead of ocamlyacc, because it's just better. You will have more readable and expressive grammars (named parameters, parametrized rules...), better error reporting and grammar debugging features.