Skip whitespace and comments with FParsec


I am trying to skip all whitespace and comments while parsing a programming language.

There are two types of comment I want to skip:

  1. Line comment: ;; skip rest of line
  2. Block comment: (; skip anything between ;)

Example code to parse, with comments and whitespace:

(type (; block comment ;) (func))
(import "env" "g" (global $g (mut i32)))
(func (type 0) ;; line comment
     i32.const 100
     global.set $g)
(export "f" (func 0))

I tried multiple approaches but the parser always breaks somewhere. My idea goes like this:

let comment : Parser<unit, Ctx> = 
    let lineComment  = skipString ";;" >>. skipRestOfLine true
    let blockComment = between (skipString "(;") (skipString ";)") (skipMany anyChar)
    spaces >>. lineComment <|> blockComment

let wsOrComment = attempt comment <|> spaces

I would like the comments to be ignored completely, just like the spaces are. Any ideas how to accomplish that? (This is my first project with FParsec.)


Solution

  • Based on the suggestion by Koenig Lear, I filtered out all comments with a regex before running the text through the parser. This may not be the nicest option, but it does the job reliably with only two lines of code.

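    open System
    open System.Text.RegularExpressions

    // Strips block comments "(; ... ;)" and line comments ";; ..." before parsing.
    let removeComments (s: string) =
        // The lazy [\s\S]*? stops at the first ";)" and also matches across line breaks;
        // a greedy .* would delete everything between two block comments on the same line.
        let regex = Regex(@"\(;[\s\S]*?;\)|;;.*")
        regex.Replace(s, String.Empty)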
    
    let input = """
    (type (; block comment ;) (func))
    (import "env" "g" (global $g (mut i32)))
    (func (type 0) ;; line comment
         i32.const 100
         global.set $g)
    (export "f" (func 0))
    """
    
    let filtered = removeComments input
    
    // parse "filtered" with FParsec
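
For comparison, the comments can probably also be skipped inside FParsec itself. The following is only a minimal sketch, not a tested drop-in: it uses `unit` as the user state instead of the question's `Ctx`, the name `ws` is made up for illustration, and nested block comments are not handled. The main difference from the attempt in the question is that the block comment body is skipped with `skipCharsTillString`, which stops at the closing `;)`, whereas `skipMany anyChar` consumes the rest of the input so the closing delimiter can never match.

```fsharp
open FParsec

// Line comment: ";;" followed by the rest of the line, including the newline.
let lineComment : Parser<unit, unit> =
    skipString ";;" >>. skipRestOfLine true

// Block comment: "(;" followed by everything up to and including the first ";)".
// Assumption: comments do not nest.
let blockComment : Parser<unit, unit> =
    skipString "(;" >>. skipCharsTillString ";)" true System.Int32.MaxValue

// Zero or more runs of whitespace or comments.
// spaces1 (not spaces) is used so that each alternative consumes input when it succeeds.
let ws : Parser<unit, unit> =
    skipMany (spaces1 <|> lineComment <|> blockComment)

// Hypothetical check: skip leading whitespace and comments, then expect the literal "(func".
match run (ws >>. pstring "(func") " (; block ;) ;; line\n  (func" with
| Success (s, _, _) -> printfn "parsed %s" s
| Failure (msg, _, _) -> printfn "%s" msg
```

With something like this in place, each token parser would be followed by `.>> ws`, in the same way `spaces` is usually used, so comments are dropped wherever whitespace is allowed.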