Search code examples
collectionsfunctional-programmingprogramming-languagesfunctorlanguage-design

Treating single and multiple elements the same way ("transparent" map operator)


I'm working on a programming language that is supposed to be easy, intuitive, and succinct (yeah, I know, I'm the first person to ever come up with that goal ;-) ). One of the features that I am considering for simplifying the use of container types is to make the methods of the container's element type available on the container type itself, basically as a shortcut for invoking a map(...) method. The idea is that working with many elements should not be different from working with a single element: I can apply add(5) to a single number or to a whole list of numbers, and I shouldn't have to write slightly different code for the "one" versus the "many" scenario.

For example (Java pseudo-code):

import static java.math.BigInteger.*; // ZERO, ONE, ...
...
// NOTE: BigInteger has an add(BigInteger) method
Stream<BigInteger> numbers = Stream.of(ZERO, ONE, TWO, TEN);
Stream<BigInteger> one2Three11 = numbers.add(ONE); // = 1, 2, 3, 11
// this would be equivalent to:  numbers.map(ONE::add)

As far as I can tell, the concept would not only apply to "container" types (streams, lists, sets...), but more generally to all functor-like types that have a map method (e.g., optionals, state monads, etc.). The implementation approach would probably be more along the lines of syntactic sugar offered by the compiler rather than by manipulating the actual types (Stream<BigInteger> obviously does not extend BigInteger, and even if it did the "map-add" method would have to return a Stream<BigInteger> instead of an Integer, which would be incompatible with most languages' inheritance rules).

I have two questions regarding such a proposed feature:

(1) What are the known caveats with offering such a feature? Method name collisions between the container type and the element type are one problem that comes to mind (e.g., when I call add on a List<BigInteger> do I want to add an element to the list or do I want to add a number to all elements of the list? The argument type should clarify this, but it's something that could get tricky)

(2) Are there any existing languages that offer such a feature, and if so, how is this implemented under the hood? I did some research, and while pretty much every modern language has something like a map operator, I could not find any languages where the one-versus-many distinction would be completely transparent (which leads me to believe that there is some technical difficulty that I'm overlooking here)

NOTE: I am looking at this in a purely functional context that does not support mutable data (not sure if that matters for answering these questions)


Solution

  • Do you come from an object-oriented background? That's my guess because you're thinking of map as a method belonging to each different "type" as opposed to thinking about various things that are of the type functor.

    Compare how TypeScript would handle this if map were a property of each individual functor:

    declare someOption: Option<number>
    someOption.map(val => val * 2) // Option<number>
    
    declare someEither: Either<string, number>
    someEither.map(val => val * 2) // Either<string,number>
    someEither.mapLeft(string => 'ERROR') // Either<'ERROR', number>
    

    You could also create a constant representing each individual functor instance (option, array, identity, either, async/Promise/Task, etc.), where these constants have map as a method. Then have a standalone map method that takes one of those "functor constant"s, the mapping function, and the starting value, and returns the new wrapped value:

    const option: Functor = {
      map: <A, B>(f: (a:A) => B) => (o:Option<A>) => Option<B>
    }
    declare const someOption: Option<number>
    map(option)(val => val * 2)(someOption) // Option<number>
    
    declare const either: Functor = {
      map: <E, A, B>(f: (a:A) => B) => (e:Either<E, A>) => Either<E, B>
    }
    declare const either: Either<string,number>
    map(either)(val => val * 2)(someEither)
    

    Essentially, you have a functor "map" that uses the first parameter to identify which type you're going to be mapping, and then you pass in the data and the mapping function.

    However, with proper functional languages like Haskell, you don't have to pass in that "functor constant" because the language will apply it for you. Haskell does this. I'm not fluent enough in Haskell to write you the examples, unfortunately. But that's a really nice benefit that means even less boilerplate. It also allows you to write a lot of your code in what is "point free" style, so refactoring becomes much easier if you make your language so you don't have to manually specify the type being used in order to take advantage of map/chain/bind/etc.

    Consider you initially write your code that makes a bunch of API calls over HTTP. So you use a hypothetical async monad. If your language is smart enough to know which type is being used, you could have some code like

    import { map as asyncMap }
    
    declare const apiCall: Async<number>
    asyncMap(n => n*2)(apiCall) // Async<number>
    

    Now you change your API so it's reading a file and you make it synchronous instead:

    import { map as syncMap }
    
    declare const apiCall: Sync<number>
    syncMap(n => n*2)(apiCall)
    

    Look how you have to change multiple pieces of the code. Now imagine you have hundreds of files and tens of thousands of lines of code.

    With a point-free style, you could do

    import { map } from 'functor'
    declare const  apiCall: Async<number>
    map(n => n*2)(apiCall)
    

    and refactor to

    import { map } from 'functor'
    declare const  apiCall: Sync<number>
    map(n => n*2)(apiCall)
    

    If you had a centralized location of your API calls, that would be the only place you're changing anything. Everything else is smart enough to recognize which functor you're using and apply map correctly.

    1. As far as your concerns about name collisions, that's a concern that will exist no matter your language or design. But in functional programming, add would be a combinator that is your mapping function passed into your fmap (Haskell term) / map(lots of imperative/OO languages' term). The function you use to add a new element to the tail end of an array/list might be called snoc ("cons" from "construct" spelled backwards, where cons prepends an element to your array; snoc appends). You could also call it push or append.

    2. As far as your one-vs-many issue, these are not the same type. One is a list/array type, and the other is an identity type. The underlying code treating them would be different as they are different functors (one contains a single element, while one contains multiple elements.

    I suppose you could create a language that disallows single elements by automatically wrapping them as a single-element lists and then just uses the list map. But this seems like a lot of work to make two things that are very different look the same.

    Instead, the approach where you wrap single elements to be identity and multiple elements to be a list/array, and then array and identity have their own under-the-hood handlers for the functor method map probably would be better.