
How to parse a line using Parsec?


I'm using the Parsec parsing library to parse some text. I simply need to parse lines, which are strings of arbitrary characters, each ending with a '\n', or with eof when it is the last line of the input. When calling parseHS' I get this error: Exception: Text.ParserCombinators.Parsec.Prim.many: combinator 'many' is applied to a parser that accepts an empty string.

import Text.ParserCombinators.Parsec

parseHS' :: String -> Either ParseError [String]
parseHS' input = parse hsFile' "(unknown)" input

hsFile' :: GenParser Char st [String]
hsFile' = do
    many1 line

line :: GenParser Char st String
line = do
    result <- many (noneOf "\n")
    optional newline
    return result

How could this be achieved correctly?


Solution

  • Of course, if you only need to split the input by lines, you could use lines.

    sepEndBy in Parsec does what you want: it splits the input into a list of parsed entities separated by a given separator, optionally ended by that separator. Follow it with eof to require that the whole input is consumed.

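    Your grammar for line lets the parser succeed without consuming any input: many (noneOf "\n") can match zero characters and optional newline can match nothing. Applied repeatedly, it could therefore produce a never-ending stream of empty lines for any input, and Parsec raises the exception you see instead of looping. This can be resolved by making the decision about newline outside of line: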

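    hsFile' = do
            x <- line           -- the first line (possibly empty)
            xs <- many $ do     -- every further line must follow a newline,
                    newline     -- so each iteration consumes input
                    line
            eof                 -- require that the whole input was consumed
            return (x:xs)

    -- A line no longer deals with its terminator at all.
    line = many $ noneOf "\n"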
    

    This version produces an empty line at the end if the file ends with a newline.
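
For comparison, here is a minimal sketch of the sepEndBy variant mentioned in the answer. The names hsFile and parseLines are made up for this example and do not come from the question. It assumes the same line parser as above and the Text.ParserCombinators.Parsec module. As far as I can tell it behaves like the hand-written version, including the trailing empty string when the input ends with a newline, which is where it differs from the Prelude function lines.

```haskell
import Text.ParserCombinators.Parsec

-- One line: any run of characters up to, but not including, a newline.
-- This parser can succeed without consuming input, so it must not be
-- passed to 'many' or 'many1' directly.
line :: GenParser Char st String
line = many (noneOf "\n")

-- Lines separated, and optionally ended, by newlines.
-- (hsFile is a hypothetical name for this sketch.)
hsFile :: GenParser Char st [String]
hsFile = sepEndBy line newline <* eof

-- Hypothetical wrapper, mirroring parseHS' from the question.
parseLines :: String -> Either ParseError [String]
parseLines = parse hsFile "(unknown)"

main :: IO ()
main = do
    print (parseLines "a\nb")    -- Right ["a","b"]
    print (parseLines "a\nb\n")  -- Right ["a","b",""]
    print (lines "a\nb\n")       -- ["a","b"]
```

This terminates because sepEndBy only continues after newline has consumed a character, so the empty-matching line parser cannot loop. If the trailing empty string is unwanted, lines is probably the simpler choice when no other parsing is needed.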