With a little bit of unsafe
, you can see how much of a lazy value has been evaluated in Haskell
import Data.IORef
import System.IO.Unsafe
data Nat = Z | S Nat
deriving (Eq, Show, Read, Ord)
natTester :: IORef Nat -> Nat
natTester ref =
let inf = natTester ref
in unsafePerformIO $ do
modifyIORef ref S
pure $ S inf
newNatTester :: IO (Nat, IORef Nat)
newNatTester = do
ref <- newIORef Z
pure (natTester ref, ref)
howMuchWasEvaled :: (Nat -> b) -> IO Nat
howMuchWasEvaled f = do
(inf, infRef) <- newNatTester
f inf `seq` readIORef infRef
With:
ghci> howMuchWasEvaled $ \x -> x > S (S Z)
S (S (S Z))
Indicating that only the first four constructors of infinity :: Nat
were evaluated.
If x
is used twice, we still get the total evaluation that was needed:
> howMuchWasEvaled $ \x -> x > Z && x > S (S Z)
S (S (S Z))
This makes sense - once we have evaluated x
up to a point, we don't have to start all over again. The thunk has already been forced.
But is there a way of checking how many times constructors were evaluated? I.e., a function magic
that behaves like this:
> magic $ \x -> x > Z
S Z
> magic $ \x -> x > Z && x > Z
S (S Z)
...
I understand this may involve compiler flags (maybe no-cse
), inline pragmas, very unsafe functions, etc.
EDIT: Carl pointed out that I may have been insufficiently clear on the constraints of what I was looking for. The requirement is that the function that magic
is given as an argument cannot be altered (though it can be assumed to be lazy in its argument). magic
would be part of a library that you could call with your own functions.
That said, GHC-specific hacks and things that only work unreliably are definitely still game.
As expressed, that cannot be done in ghc. Two uses of the same name, such as x
in your example, will always be shared with ghc's implementation of haskell's evaluation model. That's a guarantee that provides a key building block for ensuring sharing is happening. At the very least, getting this to do what you want will require passing in multiple values, one for each independent place you want to use the named value.
Then you'll have to ensure that on the calling side, the values aren't accidentally shared before being passed in to the function. This can be done, but it's the place that might require the use of options like -fno-cse
or -fno-full-laziness
, depending on how you implement it and what optimization level ghc is running at.
Here's a small modification of your starting point that works in ghci, at least:
{-# OPTIONS_GHC -fno-full-laziness #-}
import Data.IORef
import System.IO.Unsafe
data Nat = Z | S Nat
deriving (Eq, Show, Read, Ord)
natTester :: IORef Nat -> Nat
natTester ref =
let inf = natTester ref
in unsafePerformIO $ do
modifyIORef ref S
pure $ S inf
newNatTester :: IO ((a -> Nat), IORef Nat)
newNatTester = do
ref <- newIORef Z
pure (\x -> x `seq` natTester ref, ref)
howMuchWasEvaled :: ((a -> Nat) -> b) -> IO Nat
howMuchWasEvaled f = do
(infGen, infRef) <- newNatTester
f infGen `seq` readIORef infRef
Use in ghci:
*Main> howMuchWasEvaled $ \gen -> let x = gen 1 ; y = gen 2 in x > Z && y > Z
S (S Z)
*Main> howMuchWasEvaled $ \gen -> let x = gen 1 in x > Z && x > Z
S Z
I replaced passing a single infinity into the function with passing it an infinity generator. The generator doesn't care what argument it's called with, as long as it's not a bottom value. (The seq
is there to be sure the function actually uses its argument, to prevent a number of optimizations ghc might go for if the argument was unused.) As long as it's called with a different value every time, ghc won't be able to cse it away, because the expressions are different. If used with optimizations, full laziness might interfere by floating natTester ref
out of the lambda in newNatTester
. To prevent that, I have added a pragma to turn off that optimization within this module. It doesn't matter in ghci by default, as it doesn't use optimizations. It might matter if this module gets compiled, so I threw the pragma in just to be sure.