Tags: ruby, parsing, bnf, treetop

Treetop grammar line continuation


I'm trying to create a grammar for a language like the following:

someVariable = This is a string, I know it doesn't have double quotes
anotherString = This string has a continuation _
      this means I can write it on multiple line _
      like this
anotherVariable = "This string is surrounded by quotes"

What are the correct Treetop grammar rules that parse the previous code correctly?

I should be able to extract the following values for the three variables:

  • This is a string, I know it doesn't have double quotes
  • This string has a continuation this means I can write it on multiple line like this
  • This string is surrounded by quotes

Thank you


Solution

  • If you define the sequence "_\n" as though it were a single white-space character, and make sure you test for it before you accept an end-of-line, your VB-style line continuation should just drop out. In VB, the newline "\n" is not white-space per se; it is a distinct statement-termination character. You probably also need to deal with carriage returns, depending on how your input is processed. I would write the white-space rule like this:

    rule white
      ( [ \t] / "_" "\r"? "\n" )+
    end
    

    Then your statement rule looks like this:

    rule variable_assignment
      white* var:([[:alpha:]]+) white* "=" white* value:((white / !"\n" .)*) "\n"
    end
    

    and your top rule:

    rule top
        variable_assignment*
    end
    

    Your language doesn't seem to have any more apparent structure than that.
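
As a sanity check on the expected values, here is a minimal plain-Ruby sketch of the same idea. It is not Treetop and not a replacement for the grammar: it treats an underscore followed by a newline as white-space and then reads one assignment per line. The names `CONTINUATION` and `parse_assignments` are hypothetical, and stripping the surrounding double quotes is an assumption based on the values the question asks for (the grammar rules above capture the quotes as part of the value).

```ruby
# Plain-Ruby cross-check, NOT Treetop. All names here are hypothetical.

# A trailing "_" plus the line break (LF or CRLF), with the white-space
# around it, is collapsed into a single space.
CONTINUATION = /[ \t]*_\r?\n[ \t]*/

# One "name = value" statement per (already joined) line.
ASSIGNMENT = /\A[ \t]*([[:alpha:]]+)[ \t]*=[ \t]*(.*?)[ \t]*\r?\n?\z/

def parse_assignments(source)
  joined = source.gsub(CONTINUATION, " ")
  joined.each_line.each_with_object({}) do |line, vars|
    match = ASSIGNMENT.match(line)
    next unless match # skip blank or malformed lines

    name = match[1]
    value = match[2]
    # Assumption: a value wrapped in double quotes is reported without them.
    if value.length >= 2 && value.start_with?('"') && value.end_with?('"')
      value = value[1..-2]
    end
    vars[name] = value
  end
end
```

Calling `parse_assignments` on the sample input from the question should give a hash with the three variable names as keys, which can be compared against what the Treetop parser extracts.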