Search code examples
haskellmonadsdo-notation

understanding do notation and bindings


I am very new to haskell and I am trying to understand the methodology used to create Monadic parser in this document https://www.cs.nott.ac.uk/~gmh/pearl.pdf

Instead of following it exactly, I am trying to do it a little bit differently in order to understand it correctly, therefore, I ended up with this code

newtype Parser a = Parser (String -> Maybe (a, String))

item :: Parser Char
item = Parser (\cs -> case cs of
            "" -> Nothing
            (c:cs) -> Just (c, cs))

getParser (Parser x) = x

instance Monad Parser where
    return x = Parser (\cs -> Just (x,cs))
    (Parser p) >>= f  = Parser (\cs -> let result = p cs in
                  case result of
                    Nothing -> Nothing
                    Just (c,cs') -> getParser (f c) cs')

takeThreeDropSecond :: Parser (Char, Char)
takeThreeDropSecond = do
    c1 <- item
    item
    c2 <- item
    return (c1, c2)

This seems to be working, but I am having hard time following what is going on in do notation.

For example; in c1 <- item, what is assigned to c1? Is it the function that is contained in Parser type, or result of that computation, or what else? Moreover, second line in do notation is just item, so does it just run item but doesn't assign the result? And finally, what does return (c1,c2) produce? Is it Parser (String -> Maybe ((c1, c2)), String) or just Just (c1, c2)?


Solution

  • The Parser type wraps up a function that can 1) represent failure using Maybe and 2) returns the remaining text that was not parsed through (a, String) along with 3) some value a that was parsed, which can be anything. The monad instance is the plumbing to tie them together. The return implementation creates a Parser around a function that 1) succeeds with Just something, 2) does not modify its input text, and 3) directly passes the value given to it. The >>= implementation takes a parser and a function, then returns a new parser created by first running the p, then based on whether that result passed or failed running f.

    In takeThreeDropSecond, first c1 <- item says "parse the given using item, assign its result to c1, and feed the rest of the input forward". This does not assign the function inside the item parser to c1, it assigns the result of running the function inside item against the current input. Then you reach item, which parses a value using item, doesn't assign it to anything, and feeds the rest of the input forward. Next you reach c2 <- item, which does basically the same thing as the first line, and finally return (c1, c2), which would expand to Parser (\cs -> Just ((c1, c2), cs)). This means that return (c1, c2) has the type Parser (Char, Char). With type annotations it would be

    takeThreeDropSecond :: Parser (Char, Char)
    takeThreeDropSecond = do
        (c1 :: Char) <- (item :: Parser Char)
        (item :: Parser Char)
        (c2 :: Char) <- (item :: Parser Char)
        (return (c1, c2) :: Parser (Char, Char))
    

    Note that the last line of any monadic do block must have the same type as the function it is a member of. Since return (c1, c2) has type Parser (Char, Char), so must takeThreeDropSecond, and vice-versa.