
How to use answers of parsers when nesting parsers (sequentially)


This is my Parser

data Parser a = MkParser (String -> Maybe (String, a))

This is a parser that parses if a particular predicate holds true.

satisfy :: (Char -> Bool) -> Parser Char
-- takes a predicate (a function from an item to True or False)
-- fails with Nothing if the predicate is False, consumes the character if it is True
satisfy pred = MkParser sf
    where
        sf (c:cs) | pred c = Just (cs, c)
        sf _ = Nothing

This parser simply returns the value I give it and leaves the input string unchanged:

pure :: a -> Parser a
pure a = MkParser sf
    where
        sf inp = Just (inp, a)

Now I have this combinator. It runs a parser and passes its result to a function that chooses the next parser based on that result.

dsc :: Parser a -> (a -> Parser b) -> Parser b
dsc (MkParser pa) pb = MkParser sf
    where
        sf inp = case pa inp of
            Nothing -> Nothing
            Just (cs, c) -> unParser (pb c) cs


-- this just runs a parser
unParser :: Parser a -> String -> Maybe (String, a)
unParser (MkParser pa) inp = pa inp
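
Collected into one module, the definitions so far can be loaded and tried out directly. This is only a sketch: the two import lines are assumptions (the post does not show its imports, and hiding Prelude's pure is one way to avoid a name clash), the predicate argument is renamed to p to avoid shadowing Prelude's pred, and demoResults is a made-up name used purely for illustration.

```haskell
import Prelude hiding (pure)        -- assumption: avoids a clash with Prelude's pure
import Data.Char (isAlpha, isDigit) -- assumption: imports are not shown in the post

data Parser a = MkParser (String -> Maybe (String, a))

-- Run a parser on an input string.
unParser :: Parser a -> String -> Maybe (String, a)
unParser (MkParser pa) inp = pa inp

-- Consume one character if it satisfies the predicate.
satisfy :: (Char -> Bool) -> Parser Char
satisfy p = MkParser sf
  where
    sf (c:cs) | p c = Just (cs, c)
    sf _ = Nothing

-- Succeed with the given value without consuming any input.
pure :: a -> Parser a
pure a = MkParser (\inp -> Just (inp, a))

-- Run the first parser, then hand its result to a function
-- that picks the parser to run on the remaining input.
dsc :: Parser a -> (a -> Parser b) -> Parser b
dsc (MkParser pa) pb = MkParser sf
  where
    sf inp = case pa inp of
      Nothing -> Nothing
      Just (cs, c) -> unParser (pb c) cs

-- Hypothetical demo values, not part of the original post.
demoResults :: [Maybe (String, Char)]
demoResults =
  [ unParser (satisfy isAlpha) "a1"                                -- Just ("1",'a')
  , unParser (satisfy isAlpha) "1a"                                -- Nothing
  , unParser (pure 'z') "a1"                                       -- Just ("a1",'z')
  , unParser (dsc (satisfy isAlpha) (\_ -> satisfy isDigit)) "a1"  -- Just ("",'1')
  ]
```

Each result pairs the unconsumed input with the parsed value, so the last entry shows dsc threading the leftover "1" from the first parser into the second.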

Now, I'm trying to parse a string. If it starts with a letter followed by a digit, I want to return the letter and the digit:

myldV2 :: Parser String
myldV2 = dsc (dsc (satisfy isAlpha) (\x-> satisfy isDigit)) (\y -> pure [x,y])

But the compiler says x is not in scope. My solution was to recreate it as in this earlier question, and it does work (basically I made it an inline parser): "Understanding bracket use in Haskell - Parser that depends on previous parser gives error when using brackets"

My second solution works too:

lde = dsc p1 f1
    where
        p1 = satisfy isAlpha
        f1 x = dsc p2 f2
            where
                p2 = satisfy isDigit
                f2 y = dsc p3 f3
                    where
                        p3 = char '!'
                        f3 = \_ -> pure [x,y]

But how do I get my first attempted solution to work? That is, this one: myldV2 = dsc (dsc (satisfy isAlpha) (\x-> satisfy isDigit)) (\y -> pure [x,y])

I basically want to know what I'm doing wrong and how to fix it, so that I better understand Haskell and what's going on. Thank you.


Solution

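  • In your first attempt, the lambda \x -> satisfy isDigit ends at its closing parenthesis, so x is no longer in scope by the time you reach \y -> pure [x,y]. The second lambda has to sit inside the first one. To see the right shape, simply inline things in your second solution and you'll get one that looks very much like your first solution. In small steps, here's the starting point: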

    lde = dsc p1 f1
        where
            p1 = satisfy isAlpha
            f1 x = dsc p2 f2
                where
                    p2 = satisfy isDigit
                    f2 y = dsc p3 f3
                        where
                            p3 = char '!'
                            f3 = \_ -> pure [x,y]
    

    Inline p3 and f3:

    lde = dsc p1 f1
        where
            p1 = satisfy isAlpha
            f1 x = dsc p2 f2
                where
                    p2 = satisfy isDigit
                    f2 y = dsc (char '!') (\_ -> pure [x,y])
    

    Inline p2 and f2:

    lde = dsc p1 f1
        where
            p1 = satisfy isAlpha
            f1 x = dsc (satisfy isDigit) (\y -> dsc (char '!') (\_ -> pure [x,y]))
    

    Inline p1 and f1:

    lde = dsc (satisfy isAlpha) (\x -> dsc (satisfy isDigit) (\y -> dsc (char '!') (\_ -> pure [x,y])))
    

    If you like, you can make some of the repeated structure here more visible with some artistic whitespace. I'll also swap parens for $, just to avoid the close parens at the end that spoil the extensibility.

    lde = dsc (satisfy isAlpha) $ \x ->
          dsc (satisfy isDigit) $ \y ->
          dsc (char '!') $ \_ ->
          pure [x,y]
    

    Hopefully the connection with the standard do-notation version pops out at you! Your dsc is very like (>>=), and if you actually named it that (by giving Parser a Monad instance whose (>>=) is dsc) you could write this to mean the same thing:

    lde = do
        x <- satisfy isAlpha
        y <- satisfy isDigit
        char '!' -- OR, if you want to be explicit about it: _ <- char '!'
        pure [x,y]
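
For the do block to compile, Parser needs Functor, Applicative, and Monad instances. A minimal sketch of those instances follows, assuming the Parser type from the question, with (>>=) given the same body as dsc. The char definition is a guess, since the question uses char without showing it, and the argument name p replaces pred to avoid shadowing the Prelude function.

```haskell
import Data.Char (isAlpha, isDigit)

data Parser a = MkParser (String -> Maybe (String, a))

-- Run a parser on an input string.
unParser :: Parser a -> String -> Maybe (String, a)
unParser (MkParser pa) inp = pa inp

-- Consume one character if it satisfies the predicate.
satisfy :: (Char -> Bool) -> Parser Char
satisfy p = MkParser sf
  where
    sf (c:cs) | p c = Just (cs, c)
    sf _ = Nothing

-- Assumed definition: the question uses char without defining it.
char :: Char -> Parser Char
char c = satisfy (== c)

instance Functor Parser where
  -- Apply f to the parsed value, leaving the remaining input alone.
  fmap f (MkParser pa) = MkParser $ \inp -> case pa inp of
    Nothing -> Nothing
    Just (rest, a) -> Just (rest, f a)

instance Applicative Parser where
  -- Same behaviour as the question's pure: succeed without consuming input.
  pure a = MkParser $ \inp -> Just (inp, a)
  -- Run the function parser, then the argument parser on what is left.
  MkParser pf <*> pa = MkParser $ \inp -> case pf inp of
    Nothing -> Nothing
    Just (rest, f) -> unParser (fmap f pa) rest

instance Monad Parser where
  -- Same body as the question's dsc.
  MkParser pa >>= f = MkParser $ \inp -> case pa inp of
    Nothing -> Nothing
    Just (rest, a) -> unParser (f a) rest

lde :: Parser String
lde = do
  x <- satisfy isAlpha
  y <- satisfy isDigit
  _ <- char '!'
  pure [x, y]
```

With these instances in place, unParser lde "a1!rest" gives Just ("rest","a1"), and unParser lde "a1" gives Nothing because the '!' is missing.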