I need parallel (but lazy) version of fmap
for Seq
from Data.Sequence
package. But package doesn't export any Seq
data constructors. So I can't just wrap it in newtype
and implement Functor
directly for the newtype
.
Can I do it without rewriting the whole package?
The best you can do is probably to splitAt
the sequence into chunks, fmap
over each chunk, and then append the pieces back together. Seq
is represented as a finger tree, so its underlying structure isn't particularly well suited to parallel algorithms—if you split it up by its natural structure, successive threads will get larger and larger pieces. If you want to give it a go, you can copy the definition of the FingerTree
type from the Data.Sequence
source, and use unsafeCoerce
to convert between it and a Seq
. You'll probably want to send the first few Deep
nodes to one thread, but then you'll have to think pretty carefully about the rest. Finger trees can be very far from weight-balanced, primarily because 3^n
grows asymptotically faster than 2^n
; you'll need to take that into account to balance work among threads.
There are at least two sensible ways to split up the sequence, assuming you use splitAt
:
Split it all before breaking the computation into threads. If you do this, you should split it from left to right or right to left, because splitting off small pieces is cheaper than splitting off large ones and then splitting those. You should append the results in a similar fashion.
Split it recursively in multiple threads. This might make sense if you want a lot of pieces or more potential laziness. Split the list near the middle and send each piece to a thread for further splitting and processing.
There's another splitting approach that might be nicer, using the machinery currently used to implement zipWith
(see the GitHub ticket I filed requesting chunksOf
), but I don't know that you'd get a huge benefit in this application.
The non-strict behavior you seek seems unlikely to work in general. You can probably make it work in many or most specific cases, but I'm not too optimistic that you'll find a totally general approach.