How does Haskell "desugar" getLine in this do block?


I've read a few books on Haskell but haven't coded in it all that much, and I'm a little confused as to what Haskell is doing in a certain case. Let's say I'm using getLine so the user can push a key to continue, but I don't really want to interpret that person's input in any meaningful way. I believe this is a valid way of doing this:

main = do
    _ <- getLine
    putStrLn "foo"

I understand the basic gist of what this is doing. getLine is an IO String, and putStrLn takes a String and returns IO (), so if I theoretically wanted to print what the user typed into the console, I'd basically use the >>= operator from the Monad class. In my case, I believe my code is equivalent to getLine >> putStrLn "foo" since I'm discarding the result of getLine.

However, what if I do this instead?

main = do
    let _ = getLine
    putStrLn "foo"

In this case, we're setting up a sort of lambda to work with something that will take an IO String, right? I could write a printIOString function to print the user's input and that would work fine. When I'm not actually using that IO String, though, the program behaves strangely... getLine doesn't even prompt me for input; the program just prints out "foo".

I'm not really sure what the "desugared" syntax would be here, or if that would shed some light on what Haskell is doing under the hood.


Solution

  • Let's warm up with a few more complicated examples.

    main = do
        x
        x
        x
        putStrLn "foo"
        where
        x = do
            getLine
    

    What do you expect this to do? I don't know about you, but what I expect is for the program to get three lines and then print something. If we desugar the second do block, we get

    main = do
        x
        x
        x
        putStrLn "foo"
        where x = getLine
    

    Since this is the desugaring of the other one, it behaves the same, getting three lines before printing. There's another line of thought that arrives at the same answer, if you don't find this first one intuitive. "Referential transparency", one of the defining features of Haskell, means exactly that you can replace a "reference" to something (that is, a variable name) with its definition, so the previous program should be exactly the same program as

    main = do
        getLine
        getLine
        getLine
        putStrLn "foo"
    

    if we are taking the equation x = getLine seriously. Okay, so we have a program that reads three lines and prints. What about this one?

    main = do
        x
        x
        putStrLn "foo"
        where x = getLine
    

    Get two lines and print. And this one?

    main = do
        x
        putStrLn "foo"
        where x = getLine
    

    Get one line and then print. Hopefully you see where this is going...

    main = do
        putStrLn "foo"
        where x = getLine
    

    Get zero lines and then print, i.e. just print immediately! I used where instead of let to make the opening example a bit more obvious, but you can pretty much always replace a where block with its let cousin without changing its meaning:

    main = let x = getLine in do
        putStrLn "foo"
    

    Since we don't refer to x, we don't even need to name it:

    main = let _ = getLine in do
        putStrLn "foo"
    

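    and this is the desugaring of the code you wrote. A let statement inside a do block only gives a name to the IO action; it does not run it. An action runs only when it is sequenced into main, either with <- or as a statement of its own (that is, via >>= or >>), which is why _ <- getLine waits for input and let _ = getLine does not.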
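
If you would like to watch the difference rather than reason about it, here is a minimal sketch that swaps getLine for a counting action. The names tick, withBind, withLet and countRuns are made up for this illustration and are not from your code; tick is only a stand-in for getLine so that the program needs no keyboard input.

```haskell
import Data.IORef (IORef, modifyIORef, newIORef, readIORef)

-- Hypothetical stand-in for getLine: instead of reading input, it bumps a
-- counter so we can observe how many times the action was actually run.
tick :: IORef Int -> IO ()
tick ref = modifyIORef ref (+ 1)

-- do { _ <- tick ref; pure () }  desugars to  tick ref >>= \_ -> pure ()
withBind :: IORef Int -> IO ()
withBind ref = do
    _ <- tick ref
    pure ()

-- do { let _ = tick ref; pure () }  desugars to  let _ = tick ref in pure ()
withLet :: IORef Int -> IO ()
withLet ref = do
    let _ = tick ref
    pure ()

-- Run a program against a fresh counter and report how often tick ran.
countRuns :: (IORef Int -> IO ()) -> IO Int
countRuns prog = do
    ref <- newIORef 0
    prog ref
    readIORef ref

main :: IO ()
main = do
    countRuns withBind >>= print  -- prints 1
    countRuns withLet  >>= print  -- prints 0
```

The let version reports zero runs because the named action is never passed to >>= or >>, so nothing ever asks for it to be executed.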