Clojure partition by filter


In Scala, the partition method splits a sequence into two separate sequences -- those for which the predicate is true and those for which it is false:

scala> List(1, 5, 2, 4, 6, 3, 7, 9, 0, 8).partition(_ % 2 == 0)
res1: (List[Int], List[Int]) = (List(2, 4, 6, 0, 8),List(1, 5, 3, 7, 9))

Note that the Scala implementation only traverses the sequence once.

In Clojure, the partition-by function splits the sequence into multiple sub-sequences, each the longest run of consecutive elements for which the predicate returns the same value:

user=> (partition-by #(= 0 (rem % 2)) [1, 5, 2, 4, 6, 3, 7, 9, 0, 8])
((1 5) (2 4 6) (3 7 9) (0 8))

while split-with, which splits only at the first element that fails the predicate, produces:

user=> (split-with #(= 0 (rem % 2)) [1, 5, 2, 4, 6, 3, 7, 9, 0, 8])
[() (1 5 2 4 6 3 7 9 0 8)]

Is there a built-in Clojure function that does the same thing as the Scala partition method?


Solution

  • I believe the function you are looking for is clojure.core/group-by. It returns a map from each key to a vector of the items in the original sequence for which the grouping function returns that key. If you use a predicate that returns true or false, you will get the split that you are looking for.

    user=> (group-by even? [1, 5, 2, 4, 6, 3, 7, 9, 0, 8])
    {false [1 5 3 7 9], true [2 4 6 0 8]}
    

    If you take a look at the implementation, it meets your requirement of making only one pass over the sequence. It also uses transients under the hood, so it should be faster than the other solutions posted thus far. One caveat: be sure of the keys that your grouping function produces. If it returns nil instead of false, the map will list failing items under the nil key. If it returns assorted non-nil values instead of true, passing items can end up spread across multiple keys. This is not a big problem; just be aware that the grouping function needs to return exactly true or false.
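
    One way to sidestep that caveat is to coerce the predicate's result with boolean before grouping. The sketch below is only an illustration: partition-pred is a hypothetical helper name, not something in clojure.core, and it returns a Scala-style pair of passing and failing items.

    ```clojure
    ;; Hypothetical helper (not part of clojure.core): a Scala-style
    ;; partition built on top of group-by.
    (defn partition-pred
      "Returns [passing failing] for pred over coll."
      [pred coll]
      ;; (comp boolean pred) coerces any truthy/falsy result to true/false,
      ;; so the resulting map can only contain the keys true and false.
      (let [groups (group-by (comp boolean pred) coll)]
        ;; group-by omits a key that never occurred, so default to [].
        [(get groups true []) (get groups false [])]))

    (partition-pred even? [1 5 2 4 6 3 7 9 0 8])
    ;; => [[2 4 6 0 8] [1 5 3 7 9]]

    ;; A set used as a predicate returns the element or nil, not true/false.
    (partition-pred #{2 4} [1 2 3 4])
    ;; => [[2 4] [1 3]]
    ```

    Without the coercion, (group-by #{2 4} [1 2 3 4]) would file the failing items under nil and give each passing item its own key.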

    The nice thing about group-by is that it is more general than just splitting a sequence into passing and failing items. You can use it to group a sequence into as many categories as you need, which makes it very flexible. That is probably why group-by, rather than separate, is in clojure.core.
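
    As a rough illustration of grouping into more than two categories, the grouping function can return any key. The size-class function and its keywords below are made up for this sketch.

    ```clojure
    ;; Illustrative classifier; the thresholds and category keywords are
    ;; assumptions chosen for this example only.
    (defn size-class [n]
      (cond
        (< n 3) :small
        (< n 7) :medium
        :else   :large))

    ;; Each distinct return value of size-class becomes a key in the map,
    ;; and items keep their original relative order within each group.
    (group-by size-class [1 5 2 4 6 3 7 9 0 8])
    ;; => {:small [1 2 0], :medium [5 4 6 3], :large [7 9 8]}
    ```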