We define the following data type Stupid
:
import qualified Data.Vector as V
import Data.List (foldl')
data Stupid = Stupid {content::V.Vector Int, ul::Int} deriving Show
Now I have two slightly different code.
foldl' (\acc x->Stupid{content=(content acc) V.// [(x,x+123)],ul=1}) (Stupid {content=V.replicate 10000 10,ul=1}) $ take 100000 $ cycle [0..9999]
takes constant memory (~100M), while
foldl' (\acc x->Stupid{content=(content acc) V.// [(x,x+123)],ul=ul acc}) (Stupid {content=V.replicate 10000 10,ul=1}) $ take 100000 $ cycle [0..9999]
takes a huge amount of memory(~8G).
Theoretically, only one copy of the current Stupid
object is needed though out the process for both cases. I don't understand why there is such a difference in memory consumption if I want to access and record the ul acc
.
Can someone explain why this happens and give a workaround for constant memory if I need to access ul acc
? Thanks.
Note: I know that I can do replacements of a vector in batch, this script is just for demonstration purpose, so please don't modify that part.
I would try to force the fields of Stupid
and see if that helps.
let f acc x = c `seq` a `seq` Stupid{content=c,ul=a}
where
c = content acc V.// [(x,x+123)]
a = ul acc
in foldl' f (Stupid {content=V.replicate 10000 10,ul=1}) $
take 100000 $
cycle [0..9999]
This should be nearly equivalent to forcing the parameters of the function:
foldl' (\acc x -> acc `seq` x `seq`
Stupid{content=(content acc) V.// [(x,x+123)],ul=ul acc})
(Stupid {content=V.replicate 10000 10,ul=1}) $ take 100000 $ cycle [0..9999]
(This can also be written with bang patterns, if one prefers those.)
Another option, more aggressive, would be to use strictness annotations in the definition of the Stupid
constructor.
data ... = Stupid { content = ! someType , ul :: ! someOtherType }
This will always force those fields in the whole program.