Search code examples
rascal

How to remove ambiguity from this piece of Delphi grammar


I'm working on a Delphi Grammar in Rascal and I'm having some problems parsing its “record” type. The relevant section of Delphi code can look as follows:

record

private 
a,b,c : Integer;
x : Cardinal;

end

Where the "private" can be optional, and the variable declaration lines can also be optional.

I tried to interpret this section using the rules below:

syntax FieldDecl    = IdentList ":" Type 
        | IdentList ":" Type ";"
        ;

syntax FieldSection = FieldDecl
        | "var" FieldDecl
        | "class" "var" FieldDecl
        ;

syntax Visibility = "private" | "protected" | "public"| "published" ; 

syntax VisibilitySectionContent = FieldSection
            | MethodOrProperty
            | ConstSection
            | TypeSection
            ;

syntax VisibilitySection = Visibility? VisibilitySectionContent+    
         ;

syntax RecordType   = "record" "end"
        | "record" VisibilitySection+  "end"
        ;   

Problem is ambiguity. The entire text between “record” and “end” can be parsed in a single VisibilitySection, but every line on its own can also be a seperate VisibilitySection.

I can change the rule VisibilitySection to

syntax VisibilitySection = Visibility 
         | VisibilitySectionContent
         ;

Then the grammar is no longer ambiguous, but the VisibilitySection becomes, flat, there is no nesting anymore of the variable lines under an optional 'private' node, which I would prefer.

Any suggestions on how to solve this problem? What I would like to do is demand a longest /greedy match on the VisibilitySectionContent+ symbol of VisibilitySection.

But changing

syntax VisibilitySection = Visibility? VisibilitySectionContent+    

to

syntax VisibilitySection = Visibility? VisibilitySectionContent+ !>> VisibilitySectionContent

does not seem to work for this.

I also ran the Ambiguity report tool on Rascal, but it does not provide me any insights.

Any thoughts?

Thanks


Solution

  • I can't check since you did not provide the full grammar, but I believe this should work to get your "longest match" behavior:

    syntax VisibilitySection 
      = Visibility? VisibilitySectionContent+ () 
      >> "public" 
      >> "private" 
      >> "published"
      >> "protected"
      >> "end"
      ;
    

    In my mind this should remove the interpretation where your nested VisibilitySections are cut short. Now we only accept such sections if they are immediately followed by either the end of the record, or the next section. I'm curious to find out if it really works because it is always hard to predict the behavior of a grammar :-)

    The () at the end of the rule (empty non-terminal) makes sure we can skip to the start of the next part before applying the restriction. This only works if you have a longest match rule on layout already somewhere in the grammar.