I have the following code as my hot loop.
{-# LANGUAGE BangPatterns #-}
module Simulation.Simulator where
import Simulation.Common ()
import System.Random (RandomGen)
import Control.Monad.Random (Rand)
simulateUntil ::
(RandomGen g) =>
Int -> --Number of simulation steps
a -> --Initial model
(a -> Rand g Bool) -> --Function to check if simulation should end
(Float -> a -> Rand g a) -> --Update model one step
Rand g a
simulateUntil !num !model !done !update = do
!stop <- done model
if stop then return model
else do updateM <- update (1 / fromIntegral num) model
simulateUntil num updateM done update
To try and make this loop more readable and more inline with the rest of my code I have added a type synonym in my Simulation.Common code:
type SearchEnv a = (RandomGen g) => Rand g a
I then changed the loop above to use this new type synonym which is used in all of my other code, the new loop is almost identical:
{-# LANGUAGE BangPatterns #-}
module Simulation.Simulator where
import Simulation.Common
simulateUntil ::
Int -> --Number of simulation steps
a -> --Initial model
(a -> SeachEnv Bool) -> --Function to check if simulation should end
(Float -> a -> SearchEnv a) -> --Update model one step
SearchEnv a
simulateUntil !num !model !done !update = do
!stop <- done model
if stop then return model
else do updateM <- update (1 / fromIntegral num) model
simulateUntil num updateM done update
Yet for some reason this last edition leaks memory, which show up as
FUN_1_0
when running with the "-h" option in GHC.
Is this expected behavior with "type" or is there something else happening?
Edit: Here is the difference in memory usage as reported by GHC "-h" option: With type synonym: https://i.sstatic.net/N0kpC.png After removing type synonym(reverting to the old code show at the top): https://i.sstatic.net/A4oZC.png
Internally, GHC represents type class constraints as function arguments in the form of a type class dictionary. So if you have a type RandomGen g => Rand g a
, it effectively gets turned into RandomGen -> Rand a
. What this means is that whenever the result of a call to done
or update
is used, the internal type class function must be recomputed. Incidentally, this also means that the result can't be shared since GHC doesn't automatically memoize functions. Internally, the types of done
and update
are sort of like this:
done :: a -> (RandomGen -> Rand Bool)
update :: Float -> a -> (RandomGen -> Rand a)
I think the specific problem is that you're passing the result of update
back in to the recursive call and each time that value is needed, the internal function with the type class dictionary must be called.
In your first version, the RandomGen
type dictionary is getting passed to the top-level function, so no extra "hidden" functions that need to get called other than that.
GHC is usually pretty good at optimizing this kind of thing away, but I suspect that this is the culprit.
Here's a simpler example of this in action. We can observe the time and memory it takes to compute an expression in GHCI with the :set +s
command:
λ> let fib n = if n <= 1 then 1 else fib (n-1) + fib (n-2)
λ> let { n :: Num a => a; n = fib 30 }
λ> let { m :: Int; m = fib 30 }
λ> :set +s
λ> m
1346269
(2.04 secs, 695163424 bytes)
λ> m
1346269
(0.00 secs, 1073792 bytes)
λ>
λ> n
1346269
(2.01 secs, 669035464 bytes)
λ> n
1346269
(2.02 secs, 669032064 bytes)
Here is another example. This is sort of like the fib
function except it adds some constant in each recursive call.
λ> let { fib1 :: (Num a, Ord a, Num b) => Int -> a -> b; fib1 m n = if n <= 1 then 1 else fromIntegral m + fib1 m (n-1) + fib1 m (n-2) }
λ> let { fib2 :: Int -> ((Num a, Ord a) => a) -> ((Num a, Ord a) => a); fib2 m n = if n <= 1 then 1 else fromIntegral m + fib2 m (n-1) + fib2 m (n-2) }
λ> :set +s
λ> fib1 1 30
2692537
(2.59 secs, 993139888 bytes)
λ> fib2 1 30
2692537
(17.98 secs, 7884453496 bytes)
Since m
gets turned into a function in the second fib
definition, it has to get called every time it is needed so no sharing can occur and this results in time and space leaks.