Search code examples
xtext

How do Syntactic Predicates Work?


The Xtext documentation, such as here: http://www.eclipse.org/Xtext/documentation.html#syntax just seems explain syntactic predicates by giving one example 'dangling else problem'. My naive interpretation of this would be: If you have ambiguous grammar then use => to select the option you want. However the results I am getting suggest that its more complicated than that, is there a better explanation somewhere? To try to understand what is going on I have contrived this simple, but ambiguous, grammar to experiment with (obviously I would not do it like this in real world):

grammar com.euclideanspace.experiment.Mydsl with org.eclipse.xtext.common.Terminals

generate mydsl "http://www.euclideanspace.com/experiment/Mydsl"

Model:
  opt=Option;

Option:
  (ID Option1 ID)
  |
  (ID Option2 ID)
  ;

Option1:
  '=='|'+=';

Option2:
  '=='|'-=';

This gives the following warnings:

warning(200): ../com.euclideanspace.experiment/src-gen/com/euclideanspace/experiment/parser/antlr/internal/InternalMydsl.g:119:1: Decision can match input such as "RULE_ID '==' RULE_ID" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input
warning(200): ../com.euclideanspace.experiment.ui/src-gen/com/euclideanspace/experiment/ui/contentassist/antlr/internal/InternalMydsl.g:176:1: Decision can match input such as "RULE_ID '==' RULE_ID" using multiple alternatives: 1, 2
As a result, alternative(s) 2 were disabled for that input

The grammar is ambiguous because input like 'a == b' can match using either Option1 or Option2. We can remove this warning by adding a syntactic predicate indicated by '=>' before the option that we want to choose for the potentially ambiguous input.

Option:
  (ID Option1 ID)
  |
  =>(ID Option2 ID)
  ;

We could also put the syntactic predicate inside the bracket like this:

Option:
  (ID Option1 ID)
  |
  (=>ID Option2 ID)
  ;

Both these positions work, so which is best? It is not clear to me how the second case works, choosing one ID in preference to another also implies Option2 over Option1. However, if we put the syntactic predicate before Option2 (which would appear to make sense as this is the option we want to choose) then we get the warning below:

Option:
 (ID Option1 ID)
 |
 (ID =>Option2 ID)
 ;

warning(200): ../com.euclideanspace.experiment/src-gen/com/euclideanspace/experiment/parser/antlr/internal/InternalMydsl.g:119:1: Decision can match input such as "RULE_ID '==' RULE_ID" using multiple alternatives: 1, 2 As a result, alternative(s) 2 were disabled for that input

So it is not just a case of putting the syntactic predicate before the option we want to choose. I think I need to understand how the parser scans the grammar so we know where to cut off unwanted options.

Is there an explanation of syntactic predicates which explains the above issues? How can syntactic predicates be hidden by actions?

Martin


Solution

  • The alternatives with the predicate have to be listed before the alternative without the predicate. E.g. your rule Option should rather look like this:

    Option:
        ID =>Option2 ID) 
      | ID Option1 ID;
    

    Please note that the '==' token will never be consumed as part of Option1 in that case, since it'll always be Option2. You may want to refactor your grammar to remove the duplicate branch there which would save you from using predicates in the first place.