f# fparsec

How do you parse with between when the closing delimiter can come right after a repeat of the same pattern?


How would you use existing FParsec functionality to cope with the closing tag's character repeating consecutively just before the tag, so that only the rightmost occurrence closes the phrase?

Such input is a legitimate possibility in this context. Pre-parsing plus escaping might work, but is there a better solution? Do we need to write a new forward-looking combinator, and if so, what would it look like?

#r"""bin\debug\FParsecCS.dll"""
#r"""bin\debug\FParsec.dll"""

open FParsec

let str = pstring
let phraseEscape = pchar '\\' >>. pchar '"'
let phraseChar = phraseEscape <|> (noneOf "|\"\r\n]")    // <- this right square bracket needs to be removed
let phrase = manyChars phraseChar

let wrapped = between (str"[[") (str"]]".>>newline) phrase 

run wrapped "[[some text]]\n"  // <- works fine

// !! problem
run wrapped "[[array[] d]]\n"    // <- that means we can't make ']' invalid in phraseChar

// !! problem
run wrapped "[[array[]]]\n"      // <- and this means that the first ]] gets matched, leaving a floating ']' that breaks the parser

Solution

  • Sorry to be answering my own question, but...

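    See the composable function phraseTill and the pend parser that is passed to it, (notFollowedBy(s"]]]")>>.(s"]]")). A "]]" only counts as the end when it is not the start of a "]]]"; otherwise one more ']' is consumed as part of the phrase and the check is repeated.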

    #r"""bin\debug\FParsecCS.dll"""
    #r"""bin\debug\FParsec.dll"""
    
    open FParsec
    
    let s = pstring
    let phraseChar = (noneOf "\r\n")   
    let phrase = manyChars phraseChar
    /// keep eating characters until the pend parser is successful
    let phraseTill pend = manyCharsTill phraseChar pend
    
    /// when not followed by the triple, a double will truly be the end
    let repeatedTo repeatedPtrn ptrn = notFollowedBy (s repeatedPtrn) >>. (s ptrn)
    let wrapped = (s "[[") >>. phraseTill (repeatedTo "]]]" "]]")
    run wrapped "[[some text]]]"   // succeeds with "some text]"
    run wrapped "[[some text]]"    // succeeds with "some text"
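
For completeness, here is a small sketch that runs the same idea against the two inputs that broke the parser in the question. It is an illustration, not part of the original answer: `wrappedLine` and `tryParse` are hypothetical names, the trailing `newline` is added back to match the question's inputs, and loading FParsec through `#r "nuget: ..."` assumes a recent `dotnet fsi` instead of the local `bin\debug` DLLs.

```fsharp
#r "nuget: FParsec"   // assumption: NuGet reference instead of the bin\debug DLLs

open FParsec

let s = pstring
let phraseChar = noneOf "\r\n"

/// keep eating characters until the pend parser is successful
let phraseTill pend = manyCharsTill phraseChar pend

/// a double only ends the phrase when it is not the start of a triple
let repeatedTo repeatedPtrn ptrn = notFollowedBy (s repeatedPtrn) >>. s ptrn

// hypothetical name: like `wrapped`, plus the newline the question expects
let wrappedLine = s "[[" >>. phraseTill (repeatedTo "]]]" "]]") .>> newline

/// hypothetical helper: turn a ParserResult into an option
let tryParse p input =
    match run p input with
    | Success (result, _, _) -> Some result
    | Failure (_, _, _) -> None

printfn "%A" (tryParse wrappedLine "[[array[] d]]\n")   // Some "array[] d"
printfn "%A" (tryParse wrappedLine "[[array[]]]\n")     // Some "array[]"
```

Because `notFollowedBy` never consumes input and `pstring` fails without consuming anything, a failed end check leaves the stream where it was, so `manyCharsTill` simply takes the next character and tries again.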
    

    NB. If you try this out in F# Interactive (FSI), make sure at least one "run wrapped" line is included in the text you send to FSI for evaluation (i.e. right-click 'Execute In Interactive'). In this example the parsers' user-state type is only pinned down when they are applied with run; without such a line the compiler reports a value restriction error. Explicit type annotations would avoid this, at the cost of being more verbose.
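
A sketch of the more verbose alternative mentioned in the note, assuming the user state is meant to be `unit` (which is what `run` expects). The alias `P<'a>` is a hypothetical name introduced here for brevity, and the NuGet reference again assumes a recent `dotnet fsi`. With the annotations in place, the definitions can be sent to FSI on their own, without a `run wrapped` line.

```fsharp
#r "nuget: FParsec"   // assumption: NuGet reference instead of the bin\debug DLLs

open FParsec

// hypothetical alias: a parser whose user-state type is fixed to unit
type P<'a> = Parser<'a, unit>

let s : string -> P<string> = pstring
let phraseChar : P<char> = noneOf "\r\n"

/// keep eating characters until the pend parser is successful
let phraseTill (pend: P<string>) : P<string> = manyCharsTill phraseChar pend

/// a double only ends the phrase when it is not the start of a triple
let repeatedTo repeatedPtrn ptrn : P<string> =
    notFollowedBy (s repeatedPtrn) >>. s ptrn

let wrapped : P<string> = s "[[" >>. phraseTill (repeatedTo "]]]" "]]")

// optional now: the types above no longer depend on this application
printfn "%A" (run wrapped "[[some text]]]")
```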