
PEG parser in pest to match regex within triple quotes, tried at https://pest.rs/


I am writing a PEG file to be used with pest for our DSL. I need to parse a key-value pair where the value is a regex within triple quotes, and I am unable to write a pest rule for it.

The value is """Some regex here"""

The rule I defined is:

TQ = {"\"\"\""}

and I need

regex = {TQ ~ Anything but not TQ ~ TQ}

I tried with

regex = {TQ ~(!TQ)* ~ TQ}

which doesn't work and is not proper PEG, and with

regex = {TQ ~ ANY* ~ TQ}

which greedily consumes all of the input, including the triple quotes at the end.

The rule should parse a regex inside triple quotes, such as:

 """^\w+\s+\d\d\d\d\-\d\d\-\d\d\s+\d\d\:\d\d\:\d\d\s+AB_02V\s+\d+\s+.*"""

Solution

  • Your definition is very close to correct, with one caveat: a pest negative predicate doesn't consume any input when it succeeds. The parser therefore cannot make progress if you only tell it what not to match.

    It also needs to know what to match. In this case, that is any single character, and pest has a built-in rule, ANY, for exactly that purpose:

    tq = { "\"\"\"" }
    
    re = { (!tq ~ ANY)* }
    
    regex = { tq ~ re ~ tq }
    

    If you want to dive deeper, see the pest book.
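
To make the behavior of `(!tq ~ ANY)*` concrete, here is a minimal hand-written sketch in plain Rust that mimics the three rules above using only the standard library. It is not how pest is implemented, and the names `TQ` and `match_triple_quoted` are made up for this illustration. It only shows the idea: at each position, first look ahead for the delimiter without consuming anything, and only if it is absent consume one character.

```rust
/// Delimiter, playing the role of the `tq` rule.
const TQ: &str = "\"\"\"";

/// Hand-written equivalent of `regex = { tq ~ re ~ tq }` with
/// `re = { (!tq ~ ANY)* }`. Returns the text matched by `re` and the
/// remaining input, or None if the input does not start with a
/// complete triple-quoted value. (Hypothetical helper, not a pest API.)
fn match_triple_quoted(input: &str) -> Option<(&str, &str)> {
    // tq: the opening delimiter must be present.
    let body_and_rest = input.strip_prefix(TQ)?;

    // (!tq ~ ANY)*: repeat while the lookahead does not see `tq`.
    let mut pos = 0;
    while !body_and_rest[pos..].starts_with(TQ) {
        // ANY: consume exactly one character; stop at end of input.
        match body_and_rest[pos..].chars().next() {
            Some(c) => pos += c.len_utf8(),
            None => break,
        }
    }

    // tq: the closing delimiter must follow the body.
    let rest = body_and_rest[pos..].strip_prefix(TQ)?;
    Some((&body_and_rest[..pos], rest))
}

fn main() {
    let input = r#""""^\w+\s+\d\d\d\d\-\d\d\-\d\d\s+.*""" trailing"#;
    match match_triple_quoted(input) {
        Some((body, rest)) => println!("body = {body:?}, rest = {rest:?}"),
        None => println!("no match"),
    }
}
```

Dropping the lookahead check from the loop gives the `ANY*` behavior from the question: the loop runs to the end of the input, so the closing delimiter can never match.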