Search code examples
haskellfilterrepa

How to filter by predicate on index in Repa


I have two Repa arrays a1 and a2 and I would like to eliminate all the elements in a2 for which the corresponding index in a1 is above a certain threshold. For example:

import qualified Data.Array.Repa as R -- for Repa
import Data.Array.Repa (Z (..), (:.)(..))


a1 = R.fromFunction (Z :. 4) $ \(Z :. x) -> [8, 15, 9, 14] ! x
a2 = R.fromFunction (Z :. 4) $ \(Z :. x) -> [0, 1, 2, 3] ! x
threshold = 10
desired = R.fromFunction (Z :. 2) $ \(Z :. x) -> [0, 2] ! x
-- 15 and 14 are above the threshold, 10

One way to do this is with selectP but I would like to avoid using this, since it computes the arrays, and I would like my arrays to remain in delayed form, if possible.

Another way is with the repa-array, but stack solver does not seem to know how to import this library with resolver nightly-2017-04-10.


Solution

  • One way to look at this issue is that, in order to create a Repa Array, you need to know the size (extent) of the Array upon creation (eg. fromFunction), but, in case of filter operation, there is no way to know the size of the resulting Array in repa without applying a thresholding predicate, essentially computing values of the resulting Array.

    Another way to look at it is, Delayed array is a simple function from an index to a value, which is fine for most operations. For filtering though, when you apply a predicate, in order to find a value at a particular index, you now need to know all values that come before that index in the resulting array, cause for any location, a value may be there, maybe not.

    vector package solves this issue elegantly with stream fusion, and repa-array, next version of Repa, which is still in experimental stage, seems to be trying to use a similar approach, except with extention to higher dimensions (I might be wrong, haven't looked too closely).

    So, short answer, there is no way to do filtering with Repa style functional fusion. Either:

    • stick to selectP - faster (probably), but less memory efficient (for sure), or
    • piggy back onto ifilter from vector package for sequential filtering