Search code examples
haskellmemory-leakstype-systems

Using "type" for a type synonym causes memory leak?


I have the following code as my hot loop.

{-# LANGUAGE BangPatterns #-}
module Simulation.Simulator where

import Simulation.Common ()
import System.Random (RandomGen)
import Control.Monad.Random (Rand)

simulateUntil :: 
        (RandomGen g) =>
        Int                             ->      --Number of simulation steps
        a                               ->      --Initial model
        (a -> Rand g Bool)              ->      --Function to check if simulation should end
        (Float -> a -> Rand g a)        ->      --Update model one step
        Rand g a 
simulateUntil !num !model !done !update = do
        !stop <- done model
        if stop then return model 
        else do updateM <- update (1 / fromIntegral num) model
                simulateUntil num updateM done update

To try and make this loop more readable and more inline with the rest of my code I have added a type synonym in my Simulation.Common code:

type SearchEnv a = (RandomGen g) => Rand g a

I then changed the loop above to use this new type synonym which is used in all of my other code, the new loop is almost identical:

{-# LANGUAGE BangPatterns #-}
module Simulation.Simulator where

import Simulation.Common

simulateUntil ::
        Int                             ->      --Number of simulation steps
        a                               ->      --Initial model
        (a -> SeachEnv Bool)              ->      --Function to check if simulation should end
        (Float -> a -> SearchEnv a)        ->      --Update model one step
        SearchEnv a 
simulateUntil !num !model !done !update = do
        !stop <- done model
        if stop then return model 
        else do updateM <- update (1 / fromIntegral num) model
                simulateUntil num updateM done update

Yet for some reason this last edition leaks memory, which show up as

FUN_1_0

when running with the "-h" option in GHC.

Is this expected behavior with "type" or is there something else happening?

Edit: Here is the difference in memory usage as reported by GHC "-h" option: With type synonym: https://i.sstatic.net/N0kpC.png After removing type synonym(reverting to the old code show at the top): https://i.sstatic.net/A4oZC.png


Solution

  • Internally, GHC represents type class constraints as function arguments in the form of a type class dictionary. So if you have a type RandomGen g => Rand g a, it effectively gets turned into RandomGen -> Rand a. What this means is that whenever the result of a call to done or update is used, the internal type class function must be recomputed. Incidentally, this also means that the result can't be shared since GHC doesn't automatically memoize functions. Internally, the types of done and update are sort of like this:

    done   :: a          -> (RandomGen -> Rand Bool)
    update :: Float -> a -> (RandomGen -> Rand a)
    

    I think the specific problem is that you're passing the result of update back in to the recursive call and each time that value is needed, the internal function with the type class dictionary must be called.

    In your first version, the RandomGen type dictionary is getting passed to the top-level function, so no extra "hidden" functions that need to get called other than that.

    GHC is usually pretty good at optimizing this kind of thing away, but I suspect that this is the culprit.

    Here's a simpler example of this in action. We can observe the time and memory it takes to compute an expression in GHCI with the :set +s command:

    λ> let fib n = if n <= 1 then 1 else fib (n-1) + fib (n-2)
    λ> let { n :: Num a => a; n = fib 30 }
    λ> let { m :: Int; m = fib 30 }
    λ> :set +s
    λ> m
    1346269
    (2.04 secs, 695163424 bytes)
    λ> m
    1346269
    (0.00 secs, 1073792 bytes)
    λ> 
    λ> n
    1346269
    (2.01 secs, 669035464 bytes)
    λ> n
    1346269
    (2.02 secs, 669032064 bytes)
    

    Here is another example. This is sort of like the fib function except it adds some constant in each recursive call.

    λ> let { fib1 :: (Num a, Ord a, Num b) => Int -> a -> b;                fib1 m n = if n <= 1 then 1 else fromIntegral m + fib1 m (n-1) + fib1 m (n-2) }
    λ> let { fib2 :: Int -> ((Num a, Ord a) => a) -> ((Num a, Ord a) => a); fib2 m n = if n <= 1 then 1 else fromIntegral m + fib2 m (n-1) + fib2 m (n-2) }
    λ> :set +s
    λ> fib1 1 30
    2692537
    (2.59 secs, 993139888 bytes)
    λ> fib2 1 30
    2692537
    (17.98 secs, 7884453496 bytes)
    

    Since m gets turned into a function in the second fib definition, it has to get called every time it is needed so no sharing can occur and this results in time and space leaks.