There is a performance issue I am trying to debug as part of a more complicated code. It seems that append
function that I am using to create a dynamic, growable vector of (Int,Int,Int,Int)
is causing one of the Int
in the tuple to be boxed and unboxed before being written to the vector. I wrote a simpler code that reproduces the issue - it seems to happen only when I add the vector growth functionality in append
function - sample code below (it doesn't do much useful work except reproducing the issue), followed by snippets of core
which show the value being boxed and unboxed:
{-# LANGUAGE BangPatterns #-}
module Test
where
import Data.Vector.Unboxed.Mutable as MU
import Data.Vector.Unboxed as U hiding (mapM_)
import Control.Monad.ST as ST
import Control.Monad.Primitive (PrimState)
import Control.Monad (when)
import GHC.Float.RealFracMethods (int2Float)
import Data.STRef (newSTRef, writeSTRef, readSTRef)
import Data.Word
type MVI1 s = MVector (PrimState (ST s)) Int
type MVI4 s = MVector (PrimState (ST s)) (Int,Int,Int,Int)
data Snakev s = S {-# UNPACK #-}!Int
!(MVI4 s)
newVI1 :: Int -> Int -> ST s (MVI1 s)
newVI1 n x = do
a <- new n
mapM_ (\i -> MU.unsafeWrite a i x) [0..n-1]
return a
-- Growable array - we always append an element. It grows by factor of 1.5 if more capacity is needed
append :: Snakev s -> (Int,Int,Int,Int) -> ST s (Snakev s)
append (S i v) x = do
if i < MU.length v then MU.unsafeWrite v i x >> return (S (i+1) v)
else MU.unsafeGrow v (floor $! 1.5 * (int2Float $ MU.length v)) >>= (\y -> MU.unsafeWrite y i x >> return (S (i+1) y))
gridWalk :: Vector Word8 -> Vector Word8 -> MVI1 s -> MVI1 s -> Snakev s -> Int -> (Vector Word8 -> Vector Word8 -> Int -> Int -> Int) -> ST s (Snakev s)
gridWalk a b fp snodes snakesv !k cmp = do
let offset = 1+U.length a
xp = offset-k
snodep <- MU.unsafeRead snodes xp -- get the index of previous snake node in snakev array
append snakesv (snodep,xp,xp,xp)
{-#INLINABLE gridWalk #-}
GHC generates a version of append
for use in gridWalk
. That function is $wa
in the core - Please note the boxed Int argument:
$wa
:: forall s.
Int#
-> MVI4 s
-> Int#
-> Int#
-> Int#
-> Int ======= Boxed value - one of (Int,Int,Int,Int) is boxed
-> State# s
-> (# State# s, Snakev s #)
$wa =
\ (@ s)
(ww :: Int#)
(ww1 :: MVI4 s)
(ww2 :: Int#)
(ww3 :: Int#)
(ww4 :: Int#)
(ww5 :: Int) === Boxed value
(w :: State# s) ->
....
....
of ipv12 { __DEFAULT ->
case (writeIntArray# ipv7 ww ww4 (ipv12 `cast` ...)) `cast` ...
of ipv13 { __DEFAULT ->
(# case ww5 of _ { I# x# ->
(writeIntArray# ipv10 ww x# (ipv13 `cast` ...)) `cast` ...
},
S (+# ww 1)
((MV_4
(+# y rb)
==== x below unboxed from arg ww5 ======
((MVector 0 x ipv1) `cast` ...)
((MVector 0 x1 ipv4) `cast` ...)
((MVector 0 x2 ipv7) `cast` ...)
((MVector 0 x3 ipv10) `cast` ...))
`cast` ...) #)
gridWalk
boxes the value when calling append
:
=== function called by gridWalk ======
a :: forall s.
Vector Word8
-> Vector Word8
-> MVI1 s
-> MVI1 s
-> Snakev s
-> Int
-> (Vector Word8 -> Vector Word8 -> Int -> Int -> Int)
-> State# s
-> (# State# s, Snakev s #)
a =
\ (@ s)
(a1 :: Vector Word8)
_
_
(snodes :: MVI1 s)
(snakesv :: Snakev s)
(k :: Int)
_
(eta :: State# s) ->
case k of _ { I# ipv ->
case snodes `cast` ... of _ { MVector rb _ rb2 ->
case a1 `cast` ... of _ { Vector _ rb4 _ ->
let {
y :: Int#
y = -# (+# 1 rb4) ipv } in
case readIntArray# rb2 (+# rb y) (eta `cast` ...)
of _ { (# ipv1, ipv2 #) ->
case snakesv of _ { S ww ww1 ->
====== y boxed below before append called ======
$wa ww ww1 ipv2 y y (I# y) (ipv1 `cast` ...)
}
}
}
}
}
So, the effect seems to be boxing of value in gridWalk
and unboxing in append
before insertion into the vector of (Int,Int,Int,Int)
. Marking append
INLINE
doesn't change the behavior - those boxed values just move in the function body of gridWalk
.
I will appreciate pointers on how to make this value unboxed. I will like to keep the functionality of append
(i.e., handle vector growth when capacity is exceeded) while refactoring it.
GHC
version is 7.6.1
. Vector version is 0.10
.
This is just a comment. I figured I would get rid of the tuple argument (adjusting the use of append
in gridWalk
), but the result is that (only) the last Int argument has to be bang'd to get everything unboxed, which does seem strange:
append :: Snakev s -> Int -> Int -> Int -> Int -> ST s (Snakev s)
append (S i v) a b c !d = do
if i < len then do MU.unsafeWrite v i (a,b,c,d)
return $ S (i+1) v
else do y <- MU.unsafeGrow v additional
MU.unsafeWrite y i (a,b,c,d)
return $ S (i+1) y
where len = MU.length v
additional = floor (1.5 * int2Float len) -- this seems kind of bizarre
-- by the way; can't you stay inside Int?
-- 3 * (len `div` 2) or something
Edit, also, you get everything unboxed if you move the application of S (i+1)
outside the do block, but I'm not sure if that gets us closer to the quarry...:
append :: Snakev s -> Int -> Int -> Int -> Int -> ST s (Snakev s)
append (S i v) a b c d = do
if i < len then liftM (S (i+1)) $ do MU.unsafeWrite v i (a,b,c,d)
return v
else liftM ( S (i+1)) $ do y <- MU.unsafeGrow v zzz
MU.unsafeWrite y i (a,b,c,d)
return y
where len = MU.length v
zzz = floor (1.5 * int2Float len)
But if liftM
is replaced by fmap
we are back to the lone unboxed in. Things go well if liftM (S (1+i)
or fmap (S (i+1)
is moved all the way out to the front:
append (S i v) a b c d = S (i+1) <$> do ...