Search code examples
smlfold

SML map a filter?


If I have this code:

fun coord_select (x : int, cs : (int*int) list) =
         List.filter (fn (first, _) => first = x ) cs

testing with input gives this:

coord_select (2, [(2,2),(2,3),(3,3),(4,3)])
: val it = [(2,2),(2,3)] : (int * int) list

Now, what if I don't give the desired first coordinate as an int but as a list of several required first coordinates such as [3,4], i.e., I want all coordinate tuples that start with 3 as well as 4? The easy way would simply to create a recursive wrapper around this that went through the list and plugged in the value as coord_select's first variable. But I would like to understand nested things better than such brute force. So I came up with this:

fun coord_match (fs : int list, cs :(int*int) list) =
         map (coord_select (f, cs)) fs

but this can't really work because, as was pointed out, coord_select in the map is actually trying to return a list -- and how does map know to plug in the members of fs into f in the first place? Common Lisp does have a device to keep functions like this from running, i.e., the ' operator. But this wouldn't help, again, because map doesn't know which variable fs is supplying. For input, e.g., I have these coordinates:

[(2,2),(2,3),(3,3),(4,3)]

and I have this list of x-coordinates to match against the above list

[3,4]

Again, I could just put a recursive wrapper around this, but I'm reaching for a more elegant nested solution from the greater fold family.


Solution

  • what if I don't give the desired first coordinate as an int but as a list of several required first coordinates such as [3,4], i.e., I want all coordinate tuples that start with 3 and 4

    It sounds like you want all coordinate tuples that start with 3 or 4, since a coordinate can't both be 3 and 4.

    Given that, you can write coord_select like:

    fun member (x, xs) =
        List.exists (fn x2 => x = x2) xs
    
    fun coord_select (xs, coords) =
        List.filter (fn (x, _) => member (x, xs)) coords
    

    the greater fold family

    This family is called catamorphisms, of which map, filter, exists and foldl belong. Since foldl is the most general of these, it is technically possible to write the code above using folds entirely:

    fun coord_select (xs, coords) =
        foldr (fn ((x, y), acc1) =>
          if foldl (fn (x2, acc2) => acc2 orelse x = x2) false xs
          then (x, y) :: acc1
          else acc1) [] coords
    

    but as should be evident, explicit folds are not very readable.

    If there's a specialised combinator that does a job, you would rather want that over a fold. And if it doesn't exist, creating it from slightly less specialised combinators improves readability. Folding is as close to manual recursion as you get and so provides little information to the reader about what kind of recursion we're attempting.

    For that reason I also made member from exists, since exists requires me to specify a predicate and my predicate is "equality with x"; so even exists, I feel, adds clutter to the coord_select function.

    You can learn more about list catamorphisms in functional programming by reading Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire (1991) by Meijer, Fokkinga, Paterson.