Search code examples
haskellpattern-matchingcombinatoricsdo-notation

Could pattern-matching in do-notation/enumFromTo slow down a Haskell code?


I've been solving a pretty easy problem: generation of all decreasing sequences in length of L, consisting of natural numbers from 1 up to M in lexicographical order. Yet, I ran into a quite strange issue. Take a look:

c :: (Ord a, Num a, Enum a) => a -> a -> [[a]]
c m 1 = map return [1..m]
c m l = do
          n   <- [l..m]
          res <- c (n - 1) (l - 1)
          return $ n:res

c' :: (Ord a, Num a, Enum a) => a -> a -> [[a]]
c' m = helper 1 m where
 helper :: (Ord a, Num a, Enum a) => a -> a -> a -> [[a]]
 helper a b 1 = map return [a..b]
 helper a b l = do
                  n    <- [a..b]
                  True <- return $ (l - 1 <= n)
                  res  <- helper a (n - 1) (l - 1)
                  return (n:res)

So, obviously, those 2 functions do absolutely the same thing (I checked them on a number of tests, they both give correct results on each), but if you try to evaluate c 100 98 and c' 100 98 in GHCi, you will see an enormous difference in time it takes:

c 100 98: around 5 seconds;

c' 100 98: around 70 seconds;

As I've mentioned, the result is the same.

So, I kind of feel uneasy about generating [a..b] every time, yet I did a small bit of asking around, and there was a suggestion that Haskell doesn't pattern-match right off the bat, but delays it due to lazy-evaluations, which causes a formidable amount of extra calls of c'. However, the second theory didn't quite hold: I set a breakpoint in my code, directly from the GHCi command prompt, to monitor the value of n, which showed that delayed pattern-matching wasn't the case.

Could the problem be actually with the enumFromTo function, or is there any other reason?


Solution

  • Changing your True <- return $ (l - 1 <= n) to True <- return $ (l <= n), to match what the first snippet does, equalizes the timings of the two for me (without changing the answer).

    Without this change, your second snippet wastes a lot of time trying to find decreasing sequences of length l among the numbers [1..l-1] (for many different values of l), a doomed task.