haskell syntax io program-entry-point keyword

What does Haskell's "do" keyword do?

I'm a C++/Java programmer, and I'm trying to learn Haskell (and functional programming in general), and I've been having a rough go at it. One thing I tried was this:

isEven :: Int -> Bool
isEven x =
    if mod x 2 == 0 then True
    else False

isOdd :: Int -> Bool
isOdd x =
    not (isEven x)

main =
    print (isEven 2)
    print (isOdd 2)

But this failed with this error during compilation:

ghc --make doubler.hs -o Main
[1 of 1] Compiling Main             ( doubler.hs, doubler.o )

doubler.hs:11:5: error:
    • Couldn't match expected type ‘(a0 -> IO ()) -> Bool -> t’
              with actual type ‘IO ()’
    • The function ‘print’ is applied to three arguments,
      but its type ‘Bool -> IO ()’ has only one
      In the expression: print (isEven 2) print (isOdd 2)
      In an equation for ‘main’: main = print (isEven 2) print (isOdd 2)
    • Relevant bindings include main :: t (bound at doubler.hs:10:1)
make: *** [all] Error 1

Then I saw some code online that used the "do" keyword, so I tried it like this:

isEven :: Int -> Bool
isEven x =
    if mod x 2 == 0 then True
    else False

isOdd :: Int -> Bool
isOdd x =
    not (isEven x)

main = do
    print (isEven 2)
    print (isOdd 2)

And it worked exactly like I thought it should.

What's going on here? Why doesn't the first code snippet work? And what does adding "do" actually do?

PS. I saw something about "monads" on the internet related to the "do" keyword, does that have something to do with this?

Solution

Why doesn't the first code snippet work?

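Outside of a do block (or another layout block such as let, where or of), line breaks don't have any significance: an indented line simply continues the expression from the line before it. So your first definition of main is equivalent to main = print (isEven 2) print (isOdd 2), which fails because print only takes one argument and here it is being applied to three.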
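As a small illustration of that point, here is a sketch using made-up names (total and totalOneLine are not from your code). The three-line definition and the one-line definition are the same expression, because the indented lines are just a continuation:

```haskell
-- One expression spread over three lines: the indented lines continue it.
total :: Int
total =
    sum
    [1, 2, 3]

-- The same definition written on a single line.
totalOneLine :: Int
totalOneLine = sum [1, 2, 3]

main :: IO ()
main = print (total == totalOneLine)
```

Here that is harmless, since sum really does take the list as its argument. In your main the same rule turns the second print into an argument of the first one.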

Now you may wonder why we can't just use line breaks to signify that one function should be called after another. The problem with that is that Haskell is (usually) lazy and purely functional, so functions don't have side-effects and there's no meaningful concept of calling one function after another.

So then how does print work at all? print is a function that takes a value of any type that has a Show instance (a Bool in your case, not necessarily a string) and produces a result of type IO (). IO is a type that represents possibly side-effecting operations. main is a value of this type, and the operations described by that value are what get executed when the program runs. And while there's no meaningful concept of calling one function after another, there is a meaningful concept of executing one IO value's operation after another one's. For this we use the >> operator, which chains two IO values together.
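To make the "IO values are just values" idea a bit more concrete, here is a minimal sketch. The name steps is hypothetical and only there for illustration. The IO values are stored in an ordinary list, and nothing is printed until main chains them with >>:

```haskell
-- A plain list of IO values. Building the list executes nothing;
-- each element only describes an operation.
steps :: [IO ()]
steps = [print (even (2 :: Int)), print (odd (2 :: Int))]

-- Chain two IO values: execute the second element first, then the first.
-- The order of execution is the order given to >>, not the order in the list.
main :: IO ()
main = (steps !! 1) >> (steps !! 0)
```

Running this prints False and then True, even though the print for even comes first in the list.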

I saw something about "monads" on the internet related to the "do" keyword, does that have something to do with this?

Yes. Monad is a type class (if you don't know what those are yet: they're similar to interfaces in OO languages), which (among others) provides the functions >> and >>=. IO is one instance of that type class (in OO terms: one type that implements that interface), and it uses those methods to chain multiple operations one after another.
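Since Monad is an interface rather than something specific to IO, the same operators work for other instances too. The following sketch uses Maybe, which is also a Monad instance in the standard library. The helpers safeHalve and quarter are made up for this example. For Maybe, >>= passes the result on only if the previous step produced one:

```haskell
-- Hypothetical helper: halving succeeds only for even numbers.
safeHalve :: Int -> Maybe Int
safeHalve n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

-- >>= feeds the result of the first step into the second step.
quarter :: Int -> Maybe Int
quarter n = safeHalve n >>= safeHalve

-- The same operators on IO: >> ignores the previous result,
-- >>= hands it to the next function.
main :: IO ()
main =
  print (quarter 12) >>
  print (quarter 6) >>
  (return (quarter 8) >>= print)
```

This prints Just 3, Nothing and Just 2. The middle case fails because halving 6 gives 3, which is odd.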

The do syntax is a more convenient way of using >> and >>=. Specifically, your definition of main is equivalent to the following without do:

main = (print (isEven 2)) >> (print (isOdd 2))

(The extra parentheses aren't necessary, but I added them to avoid any confusion about precedence.)

So main produces an IO value that executes the steps of print (isEven 2), followed by those of print (isOdd 2).
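Your main only needs >>, because neither print produces a result you use afterwards. When a do block names a result with <-, the translation uses >>= instead. Here is a rough sketch of both forms side by side. The names withDo and withoutDo are made up, and isEven is a shortened version of yours:

```haskell
isEven :: Int -> Bool
isEven x = mod x 2 == 0

-- With do: "<-" gives a name to the result of an IO value.
withDo :: IO ()
withDo = do
    answer <- return (isEven 2)
    print answer
    print (not answer)

-- Roughly what that do block stands for once written with >>= and >>.
withoutDo :: IO ()
withoutDo =
    return (isEven 2) >>= \answer ->
        print answer >> print (not answer)

main :: IO ()
main = withDo >> withoutDo
```

Both versions print True followed by False, so running main prints that pair twice.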