Search code examples
haskellreferential-transparency

"inject" progress logging/tracing in haskell computation?


I'm picking a specific task to illustrate what I was talking about

Let's say I wanted to find the sum of all the factors of a large number, naively -- by checking every number below it if it was a factor, then adding them together.

In an imperative programming language with no separation between IO and pure computations, you might do something like this

def sum_of_factors(n):
  sum = 0
  for i between 1 and n:
    if (n % i == 0):
      sum += i
  return sum

However if my n is large, I'd end up staring at an empty screen for a long time before the computation finishes. So I add some logging --

def sum_of_factors(n):
  sum = 0
  for i between 1 and n:
    if (i % 1000 == 0):
      print "checking $i..."
    if (n % i == 0):
      print "found factor $i"
      sum += 1
  return sum

and really, this addition is trivial.

Now, if I were to do this in textbook haskell i might do

sum_of_factors :: Int -> Int
sum_of_factors n = foldl' (+) 0 factors
  where
    factors = filter ((== 0) . (mod n)) [1..n]

I run into the same problem as before...for large numbers, I just stare at a blank screen for a while.

But I can't figure out how to inject the same kind of tracing/logging in the Haskell code. i'm not sure, other than maybe re-implementing fold with explicit recursion, to get the same tracing pattern/result as in the imperative impure code.

Is there a faculty in Haskell to make this doable? One that doesn't require refactoring everything?

Thank you


Solution

  • There is a number of possible solutions.

    The simplest one is to alter your function to return stream of events instead of the final result. You sum_of_factors doesn't compile for me, so I'll use a sum function as an example. The idea is to send Left message to show progress, and send Right result when done. Thanks to lazy evaluation, you'll see progress events while the function is working:

    import Control.Monad
    
    sum' :: [Int] -> [Either String Int]
    sum' = go step 0
      where
      step = 10000
      go _ res [] = [Right res]
      go 0 res (x:xs) = Left ("progress: " ++ show x) : go step (res + x) xs
      go c res (x:xs) = go (c - 1) (res + x) xs
    
    main :: IO ()
    main = do
      forM_ (sum' [1..1000000]) $ \event ->
        case event of
          Right res -> putStrLn $ "Result: " ++ show res
          Left str -> putStrLn str
    

    Other (and better from my point of view) solution is to make the function monadic:

    class Monad m => LogM m where
      logMe :: String -> m ()
    
    instance LogM IO where
      logMe = putStrLn
    
    sum' :: LogM m => [Int] -> m Int
    sum' = go step 0
      where
      step = 10000
      go _ res [] = return res
      go 0 res (x:xs) = logMe ("progress: " ++ show x) >> go step (res + x) xs
      go c res (x:xs) = go (c - 1) (res + x) xs
    
    main :: IO ()
    main = sum' [1..1000000] >>= print
    

    or using foldM:

    import Control.Monad
    
    sum' :: LogM m => [Int] -> m Int
    sum' = liftM snd . foldM go (0, 0)
      where
        step = 10000
        -- `!` forces evaluation and prevents build-up of thunks.
        -- See the BangPatterns language extension.
        go (!c, !res) x = do
            when (c == 0) $ logMe ("progress: " ++ show x)
            return $ ((c + 1) `mod` step, res + x)