I have two Repa arrays a1
and a2
and I would like to eliminate all the elements in a2
for which the corresponding index in a1
is above a certain threshold. For example:
import qualified Data.Array.Repa as R -- for Repa
import Data.Array.Repa (Z (..), (:.)(..))
a1 = R.fromFunction (Z :. 4) $ \(Z :. x) -> [8, 15, 9, 14] ! x
a2 = R.fromFunction (Z :. 4) $ \(Z :. x) -> [0, 1, 2, 3] ! x
threshold = 10
desired = R.fromFunction (Z :. 2) $ \(Z :. x) -> [0, 2] ! x
-- 15 and 14 are above the threshold, 10
One way to do this is with selectP
but I would like to avoid using this, since it computes the arrays, and I would like my arrays to remain in delayed form, if possible.
Another way is with the repa-array
, but stack solver
does not seem to know how to import this library with resolver nightly-2017-04-10
.
One way to look at this issue is that, in order to create a Repa Array, you need to know the size (extent
) of the Array upon creation (eg. fromFunction
), but, in case of filter operation, there is no way to know the size of the resulting Array in repa without applying a thresholding predicate, essentially computing values of the resulting Array.
Another way to look at it is, D
elayed array is a simple function from an index to a value, which is fine for most operations. For filtering though, when you apply a predicate, in order to find a value at a particular index, you now need to know all values that come before that index in the resulting array, cause for any location, a value may be there, maybe not.
vector package solves this issue elegantly with stream fusion, and repa-array, next version of Repa, which is still in experimental stage, seems to be trying to use a similar approach, except with extention to higher dimensions (I might be wrong, haven't looked too closely).
So, short answer, there is no way to do filtering with Repa style functional fusion. Either:
selectP
- faster (probably), but less memory efficient (for sure), orifilter
from vector
package for sequential
filtering