parsing, haskell, parsec

Parser for Quoted string using Parsec


I want to parse input strings like this: "this is \"test \" message \"sample\" text"

First, I wrote a parser for a plain string that contains no escaped quotes:

parseString :: Parser String
parseString = do
  char '"'
  x <- (many $ noneOf "\"")
  char '"'
  return x

This parses simple strings like this: "test message"
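
To make the limitation concrete, here is a minimal sketch of running this parser on both kinds of input. The helper name `run` and the source name "input" are only illustrative, and the imports assume the parsec package (`Text.Parsec` and `Text.Parsec.String`).

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Same parser as above, repeated so the snippet is self-contained.
parseString :: Parser String
parseString = do
  _ <- char '"'
  x <- many (noneOf "\"")
  _ <- char '"'
  return x

-- Hypothetical helper: run parseString on an input string.
run :: String -> Either ParseError String
run = parse parseString "input"
```

In GHCi, `run "\"test message\""` gives `Right "test message"`, while `run "\"this is \\\"test\\\" text\""` gives `Right "this is \\"`: the parser treats the first escaped quote as the closing quote and stops there.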

Then I wrote a parser for quoted strings:

quotedString :: Parser String
quotedString = do
  initial <- string "\\\""
  x <- many $ noneOf "\\\"" 
  end <- string "\\\""
  return $ initial ++ x ++ end

This parses strings like this: \"test message\"

Is there a way to combine the two parsers so that I can parse the input above? What is the idiomatic way to tackle this problem?


Solution

  • This is what I would do:

    import Text.Parsec
    import Text.Parsec.String (Parser)

    escape :: Parser String
    escape = do
        d <- char '\\'
        c <- oneOf "\\\"0nrvtbf" -- all the characters which can be escaped
        return [d, c]
    
    nonEscape :: Parser Char
    nonEscape = noneOf "\\\"\0\n\r\v\t\b\f"
    
    character :: Parser String
    character = fmap return nonEscape <|> escape
    
    parseString :: Parser String
    parseString = do
        char '"'
        strings <- many character
        char '"'
        return $ concat strings
    

    Now all you need to do is call it:

    parse parseString "test" "\"this is \\\"test \\\" message \\\"sample\\\" text\""
    

    Parser combinators are a bit difficult to understand at first, but once you get the hang of them, they are easier than writing BNF grammars.
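
Note that `parseString` above keeps each escape sequence verbatim, backslash included. If the decoded characters are wanted instead, one possible variant is sketched below. The names `decodedEscape` and `decodedString` are hypothetical, and the sketch assumes the same set of escapable characters as `escape`.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical helper: parse a backslash escape and return the
-- character it denotes rather than the two-character sequence.
decodedEscape :: Parser Char
decodedEscape = do
    _ <- char '\\'
    c <- oneOf "\\\"0nrvtbf"
    return $ case c of
        '0' -> '\0'
        'n' -> '\n'
        'r' -> '\r'
        'v' -> '\v'
        't' -> '\t'
        'b' -> '\b'
        'f' -> '\f'
        _   -> c  -- backslash and double quote stand for themselves

-- Hypothetical name: like parseString, but with escapes decoded.
decodedString :: Parser String
decodedString = between (char '"') (char '"') (many piece)
  where
    -- Either a decoded escape or any character that needs no escaping.
    piece = decodedEscape <|> noneOf "\\\"\0\n\r\v\t\b\f"
```

For example, `parse decodedString "test" "\"say \\\"hi\\\"\\n\""` gives `Right "say \"hi\"\n"`. Since every piece is now a single `Char`, the `concat` step is no longer needed.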