I have this mutable array in an ST
monad. And I have this loop function.
runST $ do
myarray <- newMArray (Sz 10) 0
loopM_ 0 (<10) (+1) (\j ->
loopM_ 0 (<10) (+1) (\i ->
when (mytruthcheck j i)
(modifyM_ myarray (pure . (+1)) ((funcofji j i) :: Int)
)))
I want to use forkST_
to run the outer loop in parallel like this.
runST $ do
myarray <- newMArray (Sz 10) 0
loopM_ 0 (<10) (+1) (\j ->
void (forkST_ (loopM_ 0 (<10) (+1) (\i ->
when (mytruthcheck j i)
(Data.Massiv.Array.Mutable.modifyM_
myarray (pure . (+1)) ((funcofji j i) :: Int)
))))
But I’m guessing this will cause thread collisions but I don’t really know although I do know it is possible that funcofji
can output the same value for different values of j
and therefore the loop can modify the same index of myarray
for different j
s. Is there a way to ensure this is done atomically or is that already the case?
Btw here’s the loopM_
function
loopM_ :: Monad m => Int -> (Int -> Bool) -> (Int -> Int) -> (Int -> m a) -> m ()
loopM_ !init' condition increment f = go init'
where
go !step
| condition step = f step >> go (increment step)
| otherwise = pure ()
As it was mentioned in the comments, atomic modification is only useful for concurrency which doesn't look like what is needed here. What you need is parallelism., which is available in massiv for Int
s: atomicAddIntArray
There is also a builtin way to do parallelism in massiv
very efficiently, so there is definitely no need to reinvent the wheel:
createArray_ Par (Sz 10) $ \scheduler myarray ->
loopM_ 0 (<10) (+1) $ \j ->
loopM_ 0 (<10) (+1) $ \i ->
when (mytruthcheck j i) $
scheduleWork_ scheduler $
void $ atomicAddIntArray myarray ((funcofji j i) :: Int) 1
Also don't lie to yourself, ST (state thread) was not build for multi-threading, use IO instead. However, if you can guarantee, that despite multi-threaded setup, the outcome that is produced is still deterministic in the end, then it is ok to use unsafePerformIO
.
Edit
I just noticed this comment:
The loop size of j is close to 100,000 and the loop size of i is close to 1 Billion.
Which makes me believe that it will be better to parallelize it in this way instead:
createArray_ Par (Sz 10) $ \scheduler myarray ->
iforSchedulerM_ scheduler (0 ..: (100000 :. 1000000000)) $ \_ (j :. i) ->
when (mytruthcheck j i) $
void $ atomicAddIntArray myarray ((funcofji j i) :: Int) 1
This will ensure that you only schedule a few jobs, instead of billions. Check out iforSchedulerM_
implementation in order to customize it parallelization if you have more insight into your per i
and j
workload.