Tags: performance, haskell, memory, lazy-evaluation, lazy-sequences

Does this Haskell code reuse previous computation?


I have a piece of Haskell code that computes regular numbers, i.e. positive integers whose only prime factors are 2, 3, or 5. The algorithm is straightforward and follows what's suggested in the Wikipedia article on regular numbers.

regularSeq :: [Integer]
regularSeq = 1 : union timesTwo (union timesThree timesFive)
  where
    timesTwo   = map (* 2) regularSeq
    timesThree = map (* 3) regularSeq
    timesFive  = map (* 5) regularSeq

union :: (Ord a) => [a] -> [a] -> [a]
union [] ys = ys
union xs [] = xs
union (x : xs) (y : ys)
  | x < y     = x : union      xs (y : ys)
  | x > y     = y : union (x : xs)     ys
  | otherwise = x : union      xs      ys

Example:

ghci> takeWhile (<= 60) regularSeq 
[1,2,3,4,5,6,8,9,10,12,15,16,18,20,24,25,27,30,32,36,40,45,48,50,54,60]

I have a couple of questions regarding the performance of this code, specifically its lazy evaluation, "memoization", and memory usage.

  1. Since the computation of a new number in the sequence relies on previous values further back, are older values of regularSeq "cached"/"memoized" and reused in the computation of timesTwo/timesThree/timesFive? Or does the recursive code spawn an inefficient degree-3 tree of computations similar to the naive Fibonacci implementation?

    fib 0 = 1
    fib 1 = 1
    fib n = fib (n-1) + fib (n-2)
    
  2. During the evaluation of regularSeq, is there only a single list of integers present in memory, with timesTwo/timesThree/timesFive acting as pointers to different elements inside this same list? Or do they point to independent lists, thus not sharing computation?

    In my mind, timesTwo/timesThree/timesFive simply "lag behind" and reuse the values already discovered by the evaluation of regularSeq; however, I'm not entirely sure this is correct.

  3. If I were to implement the sequence in an imperative language (say C or Rust) as an endless stream, I would keep in memory only the values from the head of timesFive to the current value, as the older ones are no longer needed to compute further elements. Is the Haskell garbage collector able to see that older values are not reachable anymore and does it deallocate them? Or does it deallocate the entire sequence only when it is discarded in its entirety?

  4. I find it quite hard to reason about the memory behavior of Haskell programs, and it is often not apparent to me whether the result of a computation is shared or something needs to be unnecessarily re-evaluated. What are some general principles and a good framework for reasoning about this problem?


Solution

  • Since the computation of a new number in the sequence relies on previous values further back, are older values of regularSeq "cached"/"memoized" and reused in the computation of timesTwo/timesThree/timesFive? Or does the recursive code spawn an inefficient degree-3 tree of computations similar to the naive Fibonacci implementation?

    They are simply stored in the list as soon as they are produced, resulting in an efficient computation. This is unlike your fib example.

    Memoization is a related but different concept: it makes computing f x fast when f x has already been computed for the same argument x. In your case, regularSeq is not a function, but only a list. No memoization is needed.
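
    For contrast, here is a minimal sketch of what memoizing a function looks like (the name memoFib and the list-as-cache trick are illustrative, not part of the original answer): a lazy list serves as a cache indexed by the argument, so each value is computed at most once across all calls.

      -- Results of fib are cached in a lazy list indexed by the
      -- argument; recursive calls go through memoFib, so each entry
      -- is computed at most once.
      memoFib :: Int -> Integer
      memoFib = (cache !!)
        where
          cache = map fib [0 ..]
          fib 0 = 1
          fib 1 = 1
          fib n = memoFib (n - 1) + memoFib (n - 2)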

    Operationally, you can think of its state as being regularSeq = x0 : x1 : ... : xN : <<to be computed>>, where the last tail is a yet-to-be-evaluated expression (a thunk).
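
    You can watch this state evolve in GHCi with :sprint, which prints a value without forcing it (a small experiment, not from the original answer; it assumes the module is loaded interpreted, and the exact output can vary between GHC versions):

      ghci> take 6 regularSeq
      [1,2,3,4,5,6]
      ghci> :sprint regularSeq
      regularSeq = 1 : 2 : 3 : 4 : 5 : 6 : _

    The trailing _ is the not-yet-evaluated tail; forcing more elements extends the evaluated prefix in place.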

  • During the evaluation of regularSeq, is there only a single list of integers present in memory, with timesTwo/timesThree/timesFive acting as pointers to different elements inside this same list? Or do they point to independent lists, thus not sharing computation?

    timesTwo/... are distinct lists (each map allocates new cons cells), but their elements are computed from the values already stored inside regularSeq, so those values are reused rather than recomputed.

    In my mind, timesTwo/timesThree/timesFive simply "lag behind" and reuse the values already discovered by the evaluation of regularSeq; however, I'm not entirely sure this is correct.

    I think this is a good intuition. The idea is that whenever we read the next element of regularSeq, its code demands the next element from timesTwo/..., according to how union works. That will in turn access the already-evaluated data in regularSeq (the "lag behind" you mentioned) and produce the result.
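
    One way to check this empirically (a sketch, not part of the original answer; it assumes the union from the question is in scope) is to instrument the multiplications with Debug.Trace and watch that each product is logged at most once:

      import Debug.Trace (trace)

      -- Each element of tracedSeq is read once by each of the three
      -- consumers, so every "2*x", "3*x" and "5*x" message should be
      -- printed at most once while the list is being forced.
      tracedSeq :: [Integer]
      tracedSeq = 1 : union timesTwo (union timesThree timesFive)
        where
          timesTwo   = map (\x -> trace ("2*" ++ show x) (2 * x)) tracedSeq
          timesThree = map (\x -> trace ("3*" ++ show x) (3 * x)) tracedSeq
          timesFive  = map (\x -> trace ("5*" ++ show x) (5 * x)) tracedSeq

    Forcing, say, take 10 tracedSeq in GHCi prints each message at most once; a naive-fib-style re-evaluation would print duplicates.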

  • If I were to implement the sequence in an imperative language (say C or Rust) as an endless stream, I would keep in memory only the values from the head of timesFive to the current value, as the older ones are no longer needed to compute further elements. Is the Haskell garbage collector able to see that older values are not reachable anymore and does it deallocate them? Or does it deallocate the entire sequence only when it is discarded in its entirety?

    Garbage collection should indeed make your code run with O(1) memory overhead beyond the data actually stored in regularSeq. This is because, in order to generate one more element of regularSeq, we only need to evaluate the "next" elements of timesTwo/...; the cells of those three lists that union has already consumed become unreachable and are collected. So, at any time, we only hold the three "next" elements, plus old ones that will be GC'd soon. One caveat: regularSeq itself is a top-level constant (a CAF), so its head stays reachable and the already-computed prefix of the main list is retained; to get the bounded-memory behavior you describe, the list must not be held at its head.
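
    As a hedged sketch of that caveat (the name nthRegular is hypothetical, and it assumes the union from the question): building the list locally, so that no permanent binding references its head, lets the GC drop cells that all readers have moved past. Note that GHC's full-laziness transformation may float such a binding back to the top level, so this is not guaranteed under all optimization settings.

      -- The list lives only inside this call; cells that (!!) and the
      -- three map readers have passed become garbage (unless full
      -- laziness floats xs out to the top level).
      nthRegular :: Int -> Integer
      nthRegular n = xs !! n
        where
          xs = 1 : union (map (* 2) xs)
                         (union (map (* 3) xs) (map (* 5) xs))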

    The optimizer might even be able to avoid materializing the lists timesTwo/..., since they will be GC'd anyway. To see if this is the case, one usually has to check the GHC Core produced by the optimizer (e.g. by compiling with -ddump-simpl).

  • I find it quite hard to reason about the memory behavior of Haskell programs, and it is often not apparent to me whether the result of a computation is shared or something needs to be unnecessarily re-evaluated. What are some general principles and a good framework for reasoning about this problem?

    It is indeed hard. In my opinion, understanding Haskell performance is the hardest task in Haskell programming. Since so much optimization is going on under the hood, it is hard to know what is really happening. At best, with experience, one can develop some intuition, but it remains a bit of a dark art.
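
    One rule of thumb that helps (my framing, not part of the original answer): at the source level, sharing follows bindings, not syntactic equality. A value bound to a name is evaluated at most once per binding, while a repeated expression is in general recomputed, since GHC performs only limited common-subexpression elimination.

      -- big is bound once, so the sum is evaluated once and shared:
      sharedTwice :: Integer
      sharedTwice = big + big
        where big = sum [1 .. 1000000]

      -- Two syntactic copies of the same expression are, in general,
      -- evaluated twice (unless GHC happens to apply CSE):
      computedTwice :: Integer
      computedTwice = sum [1 .. 1000000] + sum [1 .. 1000000]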

    To improve your intuition, you can try reading Simon Peyton Jones's book The Implementation of Functional Programming Languages, even though it is old by now. You can also experiment with :sprint to observe thunks as they are evaluated, as in the example above. Or try ghc-vis, if it still works and you feel really adventurous.