f# fparsec

Parsing identifiers and freeform text. Can this be done with FParsec?


As a follow-on to: How do I test for exactly 2 characters with fparsec?

I need to parse a string that consists of pairs, each an identifier followed by freeform text. I can easily construct a parser that finds the identifiers, which have the form: a newline, followed by exactly two uppercase characters, followed by a space. The freeform text, which is associated with the preceding identifier, is everything following the identifier up to but not including the next identifier.

So for example:

AB Now is the
time for all good
men.
CD Four score and seven years ago EF our.

contains two identifiers, AB and CD, and two pieces of freeform text:

Now is the\ntime for all good\nmen.
Four score and seven years ago EF our.

My problem is I don't know how to construct a parser that would match the freeform text but not match the identifiers. Is this a case where I need to do backtracking?

Can this be done and if so how?


Solution

  • I think notFollowedBy is what you're looking for. This should do the trick:

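    open FParsec

    // adapted from the other question: a newline, then exactly two uppercase letters
    let identifier = skipNewline >>. manyMinMaxSatisfy 2 2 CharParsers.isUpper

    // take any character, as long as no identifier starts at the current position
    let freeform = manyChars (notFollowedBy identifier >>. anyChar)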
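
To show how these two parsers might fit together on the input from the question, here is a minimal sketch rather than a definitive implementation. It makes a few assumptions that are not in the original answer: the input starts with a newline so that the first identifier matches the same pattern as the others, `identifier` also requires the trailing space described in the question, and the user state is pinned to `unit` so the definitions compile without hitting F#'s value restriction. The names `entry`, `document` and `sample` are hypothetical and only serve the illustration.

```fsharp
open FParsec

// Identifier: a newline, exactly two uppercase letters, then a space.
// The trailing space is an assumption taken from the question's description.
let identifier : Parser<string, unit> =
    skipNewline >>. manyMinMaxSatisfy 2 2 isUpper .>> skipChar ' '

// Freeform text: every character up to, but not including, the next identifier.
// notFollowedBy only looks ahead; it does not consume input.
let freeform : Parser<string, unit> =
    manyChars (notFollowedBy identifier >>. anyChar)

// Hypothetical helpers: one (identifier, text) pair, and a whole document of pairs.
let entry = identifier .>>. freeform
let document = many entry .>> eof

// Assumption: a leading newline is prepended so the first identifier also matches.
let sample =
    "\nAB Now is the\ntime for all good\nmen.\nCD Four score and seven years ago EF our."

match run document sample with
| Success (pairs, _, _) ->
    for (name, text) in pairs do
        printfn "[%s] %s" name text
| Failure (message, _, _) ->
    printfn "Parse failed: %s" message
```

On this sample the parse should produce two pairs: AB with the three-line text, and CD with "Four score and seven years ago EF our." The EF in the middle of the line stays part of the text because it is not preceded by a newline. No explicit backtracking combinator is needed here, since the lookahead in `notFollowedBy` leaves the parser position unchanged.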