regex scala fastparse

How can I compose a parser to parse a quoted regex in fastparse?


What I wish to parse is any regex that is quoted with double quotes. For example, "([A-Z]+[A-Z]+[C])"

What I have tried so far is the following, in Scala using the fastparse library:

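  import fastparse._, NoWhitespace._

  // Matches any sequence of characters, including double quotes
  def regex[_: P]: P[Unit] = P(AnyChar.rep).log
  // Expects an opening quote, the regex body, and a closing quote
  def quotedRegex[_: P]: P[Unit] = P("\"" ~ regex ~ "\"").log

  val Parsed.Failure(label, index, extra) = parse(""""str"""", quotedRegex(_))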

But parsing fails, with the following log output:

+quotedRegex:1:1, cut
  +regex:1:2, cut
  -regex:1:2:Success(1:6, cut)
-quotedRegex:1:1:Failure(quotedRegex:1:1 / "\"":1:6 ..."\"str\"", cut)
label = "\""
index = 5
trace = TracedFailure((any-character | "\""),(any-character | "\""),Parsed.Failure(Expected "\"":1:6, found ""))

What I understand so far is that the regex parser consumes the closing double quote as well: AnyChar.rep runs to the end of the input (1:6 in the log), so the final "\"" has nothing left to match. I have not been able to figure out how to avoid that. I presume we need a lookahead of some sort so that the last character is not consumed, but I am not sure how to write it.

Please help.


Solution

  • To do a negative lookahead, use the ! operator. !"\"" succeeds only if the next character is not a double quote, and it does not consume any input, just like a negative lookahead in an ordinary regex. You can then match AnyChar (or some other pattern) after it.

    def regex[_: P]: P[Unit] = P((!"\"" ~ AnyChar).rep).log
    

    Here it is running in Scastie.
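
    For completeness, here is a minimal self-contained sketch of how the fixed parser might be used end to end. It assumes fastparse 2.x on Scala 2 (where the [_: P] syntax from the question compiles). The object name QuotedRegexExample, the capture with .! and the trailing End are additions for illustration and are not part of the original answer.

```scala
import fastparse._, NoWhitespace._

// Hypothetical wrapper object, introduced only for this example.
object QuotedRegexExample {
  // Body of the regex: zero or more characters, each of which is not a
  // double quote. !"\"" is the negative lookahead; it consumes nothing.
  // The trailing .! captures the matched text as a String.
  def regex[_: P]: P[String] = P((!"\"" ~ AnyChar).rep.!)

  // Opening quote, captured body, closing quote, then end of input.
  def quotedRegex[_: P]: P[String] = P("\"" ~ regex ~ "\"" ~ End)

  def main(args: Array[String]): Unit = {
    val input = "\"([A-Z]+[A-Z]+[C])\""
    parse(input, quotedRegex(_)) match {
      case Parsed.Success(value, _) => println(s"parsed: $value") // parsed: ([A-Z]+[A-Z]+[C])
      case f: Parsed.Failure        => println(s"failed: ${f.msg}")
    }
  }
}
```

    Note that this sketch stops at the first double quote, so a regex that itself contains an escaped double quote would need an extra alternative in the repeated body.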