I need to make each instance of Sphere
get a unique identifier so that no two Sphere
s are equal. I won't know ahead of time how many spheres I'll need to make so will need to make them one at a time, but still increment the identifier.
Most solutions I've tried have this issue where I end up with an IO a
and need the unsafePerformIO
to get the value.
This code comes close, but the resulting identifier
is always the same:
module Shape ( Sphere (..)
, sphere
, newID
) where
import System.Random
import System.IO.Unsafe (unsafePerformIO)
data Sphere = Sphere { identifier :: Int
} deriving (Show, Eq)
sphere :: Sphere
sphere = Sphere { identifier = newID }
newID :: Int
newID = unsafePerformIO (randomRIO (1, maxBound :: Int))
This would work as well, and works great in the REPL, but when I put it in a function, it only returns a new value the first time and the same value after that.
import Data.Unique
sphere = Sphere { identifier = (hashUnique $ unsafePerformIO newUnique) }
I know think this all leads to the State Monad, but I don't understand that yet. Is there no other way that will "get the job done", without biting off all the other monad stuff?
First of all, don’t use unsafePerformIO
here. It doesn’t do what you want anyway: it doesn’t “get the a
out of an IO a
”, since an IO a
doesn’t contain an a
; rather, unsafePerformIO
hides an IO action behind a magical value that executes the action when somebody evaluates the value, which could happen multiple times or never because of laziness.
Is there no other way that will "get the job done", without biting off all the other monad stuff?
Not really. You’re going to have to maintain some kind of state if you want to generate unique IDs. (You may be able to avoid needing unique IDs altogether, but I don’t have enough context to say.) State can be handled in a few ways: manually passing values around, using State
to simplify that pattern, or using IO
.
Suppose we want to generate sequential IDs. Then the state is just an integer. A function that generates a fresh ID can simply take that state as input and return an updated state. I think you’ll see straight away why that’s too simple, so we tend to avoid writing code like this:
-- Differentiating “the next-ID state” from “some ID” for clarity.
newtype IdState = IdState Id
type Id = Int
-- Return new sphere and updated state.
newSphere :: IdState -> (Sphere, IdState)
newSphere s0 = let
(i, s1) = newId s0
in (Sphere i, s1)
-- Return new ID and updated state.
newId :: IdState -> (Id, IdState)
newId (IdState i) = (i, IdState (i + 1))
newSpheres3 :: IdState -> ((Sphere, Sphere, Sphere), IdState)
newSpheres3 s0 = let
(sphere1, s1) = newSphere s0
(sphere2, s2) = newSphere s1
(sphere3, s3) = newSphere s2
in ((sphere1, sphere2, sphere3), s3)
main :: IO ()
main = do
-- Generate some spheres with an initial ID of 0.
-- Ignore the final state with ‘_’.
let (spheres, _) = newSpheres3 (IdState 0)
-- Do stuff with them.
print spheres
Obviously this is very repetitive and error-prone, since we have to pass the correct state along at each step. The State
type has a Monad
instance that abstracts out this repetitive pattern and lets you use do
notation instead:
import Control.Monad.Trans.State (State, evalState, state)
newSphere :: State IdState Sphere
newSphere = do
i <- newId
pure (Sphere i)
-- or:
-- newSphere = fmap Sphere newId
-- newSphere = Sphere <$> newId
-- Same function as before, just wrapped in ‘State’.
newId :: State IdState Id
newId = state (\ (IdState i) -> (i, IdState (i + 1)))
-- Much simpler!
newSpheres3 :: State IdState (Sphere, Sphere, Sphere)
newSpheres3 = do
sphere1 <- newSphere
sphere2 <- newSphere
sphere3 <- newSphere
pure (sphere1, sphere2, sphere3)
-- or:
-- newSpheres3 = (,,) <$> newSphere <*> newSphere <*> newSphere
main :: IO ()
main = do
-- Run the ‘State’ action and discard the final state.
let spheres = evalState newSpheres3 (IdState 0)
-- Again, do stuff with the results.
print spheres
State
is what I would reach for normally, since it can be used within pure code, and combined with other effects without much trouble using StateT
, and because it’s actually immutable under the hood, just an abstraction on top of passing values around, you can easily and efficiently save and roll back states.
If you want to use randomness, Unique
, or make your state actually mutable, you generally have to use IO
, because IO
is specifically about breaking referential transparency like that, typically by interacting with the outside world or other threads. (There are also alternatives like ST
for putting imperative code behind a pure API, or concurrency APIs like Control.Concurrent.STM.STM
, Control.Concurrent.Async.Async
, and Data.LVish.Par
, but I won’t go into them here.)
Fortunately, that’s very similar to the State
code above, so if you understand how to use one, it should be easier to understand the other.
With random IDs using IO
(not guaranteed to be unique):
import System.Random
newSphere :: IO Sphere
newSphere = Sphere <$> newId
newId :: IO Id
newId = randomRIO (1, maxBound :: Id)
newSpheres3 :: IO (Sphere, Sphere, Sphere)
newSpheres3 = (,,) <$> newSphere <*> newSphere <*> newSphere
main :: IO ()
main = do
spheres <- newSpheres3
print spheres
With Unique
IDs (also not guaranteed to be unique, but unlikely to collide):
import Data.Unique
newSphere :: IO Sphere
newSphere = Sphere <$> newId
newId :: IO Id
newId = hashUnique <$> newUnique
-- …
With sequential IDs, using a mutable IORef
:
import Data.IORef
newtype IdSource = IdSource (IORef Id)
newSphere :: IdSource -> IO Sphere
newSphere s = Sphere <$> newId s
newId :: IdSource -> IO Id
newId (IdSource ref) = do
i <- readIORef ref
writeIORef ref (i + 1)
pure i
-- …
You’re going to have to understand how to use do
notation and functors, applicatives, and monads at some point, because that’s just how effects are represented in Haskell. You don’t necessarily need to understand every detail of how they work internally in order to just use them, though. I got pretty far when I was learning Haskell with some rules of thumb, like:
A do
statement can be:
An action: (action :: m a)
Often m ()
in the middle
Often pure (expression :: a) :: m a
at the end
A let
binding for expressions: let (var :: a) = (expression :: a)
A monadic binding for actions: (var :: a) <- (action :: m a)
f <$> action
applies a pure function to an action, short for do { x <- action; pure (f x) }
f <$> action1 <*> action2
applies a pure function of multiple arguments to multiple actions, short for do { x <- action1; y <- action2; pure (f x y) }
action2 =<< action1
is short for do { x <- action1; action2 x }