Search code examples
haskelltypesmegaparsec

megaparsec: how to declare the type of `eol` to parse Text and not [Char]


I can't understand what does the type of (for example) eol mean:

eol :: (MonadParsec e s m, Token s ~ Char) => m String

or, better, I don't understand how to use eol with Text.Megaparsec.Text and not Text.Megaparsec.String.

I've been trying to use learn how to use Megaparsec following the (old) tutorial for Parsec from Real World Haskell (I actually started reading RWH tutorial first before finding out that Megaparsec existed). I rewrote the code of the first example to use Megaparsec (see below). But I found that when I try to force the type of eol to Parser Text the compiler throws the error: Couldn't match type ‘[Char]’ with ‘Text’, what I gather from this is that I cannot use eol with Text or, more likely, I don't know how to change that Token s ~ Char context from the eol declaration to use Token Text.

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE NoImplicitPrelude #-}

module CSVParser (
  module CSVParser
) where

import Foundation
import Data.Functor.Identity (Identity)
import Text.Megaparsec
import Text.Megaparsec.Text
import Data.Text

csvFile :: Parser [[Text]]
csvFile =
    do result <- many line
       eof
       return result

line :: Parser [Text]
line =
    do result <- cells
       --eol :: Parser Text -- uncommenting this line results in a compilation error
       eol
       return result

cells :: Parser [Text]
cells =
    do first <- cellContent
       next <- remainingCells
       return (first : next)

remainingCells =
    (char ',' >> cells)
    <|> return []

cellContent :: Parser Text
cellContent = fromList <$> many (noneOf [',','\n'])

parseCSV :: Text -> Either (ParseError (Token Text) Dec) [[Text]]
parseCSV = parse csvFile "(unknown)"

Solution

  • In the type:

    eol :: (MonadParsec e s m, Token s ~ Char) => m String
    

    the ~ is a type equality constraint, and the MonadParsec and Token typeclasses are defined by Megaparsec. They can roughly be interpreted as follows:

    • MonadParsec e s m is an assertion that type m is a monadic parser that reads a Stream of type s and represents errors using an ErrorComponent of type e
    • Token s is the underlying type of the tokens read from stream s

    So, the full type can be interpreted as: eol is a monadic parser with "return value" String that parses a stream whose tokens are Char.

    For your problem, most of this can be ignored. The issue you're running into is that eol returns a String value as the result of the parse, and a String isn't a Text, so you can't make an eol (which is of type Parser String) be of type Parser Text, no matter how hard you try.

    Two solutions are to ignore the unwanted String return value or, if you need it as text, convert it:

    Data.Text.pack <$> eol