FParsec: limit the number of characters that a parser is applied to


I have a problem where, while parsing a stream, I get to a point where the next N characters need to be parsed by applying a specific parser multiple times (in sequence).

(stripped down toy) Example:

17<tag><anothertag><a42...
  ^
  |- I'm here

Let's say the 17 indicates that the next N=17 characters make up tags. I need to apply my "tagParser" repeatedly but stop after 17 characters, without consuming the rest even if it looks like a tag, because that input has a different meaning and will be parsed by another parser.

I cannot use many or many1 because they would consume the stream beyond those N characters. Nor can I use parray, because I do not know how many successful applications of the parser fit within the N characters.

I was looking into manyMinMaxSatisfy but could not figure out how to make use of it in this case.

Is there a way to cut N chars out of a stream and feed them to some parser? Or is there a way to apply a parser repeatedly, but only up to N chars?

Thanks.


Solution

  • You can use getPosition to make sure you don't go past the specified number of characters. I threw this together (using F# 6) and it seems to work, although simpler/faster solutions may be possible:

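    open FParsec

    /// Applies p repeatedly until exactly nChars characters have been consumed.
    /// Note: p must consume input on success, or the loop never terminates.
    let manyLimit (nChars: int64) p =
        parse {
            let! startPos = getPosition

            // Accumulates results in reverse order.
            let rec loop values =
                parse {
                    let! curPos = getPosition
                    // Position.Index is an int64 character offset into the stream.
                    let nRemain = (startPos.Index + nChars) - curPos.Index
                    if nRemain = 0L then
                        return values
                    elif nRemain > 0L then
                        let! value = p
                        return! loop (value :: values)
                    else
                        // p ran past the end of the nChars window.
                        return! fail $"limit exceeded by {-nRemain} chars"
                }

            let! values = loop []
            return values |> List.rev
        }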
    

    Test code:

    let ptag =
        between
            (skipChar '<')
            (skipChar '>')
            (manySatisfy (fun c -> c <> '>'))
        
    let parser =
        parse {
            let! nChars = pint64
            let! tags = manyLimit nChars ptag
            let! rest = restOfLine true
            return tags, rest
        }
    
    run parser "17<tag><anothertag><a42..."
        |> printfn "%A"
    

    Output is:

    Success: (["tag"; "anothertag"], "<a42...")
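
The question also asks whether N characters can be cut out of the stream and handed to a parser. A minimal sketch of that alternative follows, assuming the count fits in an int32 and the inner parser needs no user state. The helper name manyWithin is hypothetical and not part of FParsec; anyString, many, eof and run are standard FParsec functions.

```fsharp
open FParsec

// Same tag parser as in the test code above, pinned to unit user state.
let ptag : Parser<string, unit> =
    between (skipChar '<') (skipChar '>') (manySatisfy (fun c -> c <> '>'))

// Hypothetical helper (not part of FParsec): read exactly nChars characters,
// then run `many p` on that substring alone and require that the substring
// is consumed completely.
let manyWithin (nChars: int) (p: Parser<'a, unit>) : Parser<'a list, unit> =
    anyString nChars
    >>= fun chunk ->
        match run (many p .>> eof) chunk with
        | Success (values, _, _) -> preturn values
        | Failure (msg, _, _) -> fail msg

let parser =
    pint32 >>= fun n -> manyWithin n ptag .>>. restOfLine true

match run parser "17<tag><anothertag><a42..." with
| Success ((tags, rest), _, _) -> printfn "%A" (tags, rest)
| Failure (msg, _, _) -> printfn "%s" msg
```

The trade-off in this sketch is that the N characters are copied into a separate string, and any error position reported by the inner run is relative to that substring rather than to the original stream.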