Search code examples
javascriptrecursionramda.jstransducer

Transducer flatten and uniq


I'm wondering if there is a way by using a transducer for flattening a list and filter on unique values?

By chaining, it is very easy:

import {uniq, flattenDeep} from 'lodash';|

const arr = [1, 2, [2, 3], [1, [4, 5]]];

uniq(flattendDeep(arr)); // ->  [1, 2, 3, 4, 5]
<script src="https://cdnjs.cloudflare.com/ajax/libs/lodash.js/4.17.10/lodash.core.min.js"></script>

But here we loop twice over the list (+ n by depth layer). Not ideal.

What I'm trying to achieve is to use a transducer for this case. I've read Ramda documentation about it https://ramdajs.com/docs/#transduce, but I still can't find a way to write it correctly.

Currently, I use a reduce function with a recursive function inside it:

import {isArray} from 'lodash';

const arr = [1, 2, [2, 3], [1, [4, 5]]];

const flattenDeepUniq = (p, c) => {
    if (isArray(c)) {
        c.forEach(o => p = flattenDeepUniq(p, o));
    }
    else {
        p = !p.includes(c) ? [...p, c] : p;
    }

    return p;
};

arr.reduce(flattenDeepUniq, []) // -> [1, 2, 3, 4, 5]
<script src="https://cdnjs.cloudflare.com/ajax/libs/lodash.js/4.17.10/lodash.core.min.js"></script>

We have one loop over the elements (+ n loop with deep depth layers) which seems better and more optimized.

Is this even possible to use a transducer and an iterator in this case? For more information about Ramda transduce function: https://gist.github.com/craigdallimore/8b5b9d9e445bfa1e383c569e458c3e26


Solution

  • Transducers don't make much sense here. Your data structure is recursive. The best code to deal with recursive structures usually requires recursive algorithms.

    How transducers work

    (Roman Liutikov wrote a nice introduction to transducers.)

    Transducers are all about replacing multiple trips through the same data with a single one, combining the atomic operations of the steps into a single operation.

    A transducer would be a good fit to turn this code:

    xs.map(x => x * 7).map(x => x + 3).filter(isOdd(x)).take(5)
    //  ^               ^                 ^               ^
    //   \               \                 \               `------ Iteration 4
    //    \               \                 `--------------------- Iteration 3
    //     \               `-------------------------------------- Iteration 2
    //      `----------------------------------------------------- Iteration 1
    

    into something like this:

    xs.reduce((r, x) => r.length >= 5 ? res : isOdd(x * 7 + 3) ? res.concat(x * 7 - 3) : res, [])
    //    ^
    //     `------------------------------------------------------- Just one iteration
    

    In Ramda, because map, filter, and take are transducer-enabled, we can convert

    const foo = pipe(
      map(multiply(7)), 
      map(add(3)), 
      filter(isOdd), 
      take(3)
    )
    
    foo([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]) //=> [17, 31, 45]
    

    (which iterates four times through the data) into

    const bar = compose(
      map(multiply(7)), 
      map(add(3)), 
      filter(isOdd), 
      take(3)
    )
    
    into([], bar, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10])  //=> [17, 31, 45]
    

    which only iterates it once. (Note the switch from pipe to compose. Tranducers compose in an order opposite that of plain functions.)

    Note the key point of such transducers is that they all operate similarly. map converts a list to another list, as do filter and take. While you could have transducers that operate on different types, and map and filter might also work on such types polymorphically, they will only work together if you're combining functions which operate on the same type.

    Flatten is a weak fit for transducers

    Your structure is more complex. While we could certainly create a function that will crawl it in in some manner (preorder, postorder), and could thus probably start of a transducer pipeline with it, the logical way to deal with a recursive structure is with a recursive algorithm.

    A simple way to flatten such a nested structure is something like this:

    const flatten = xs => xs.reduce(
      (a, x) => concat(a, isArray(x) ? flatten(x) : [x]), 
      []
    );
    

    (For various technical reasons, Ramda's code is significantly more complex.)

    This recursive version, though, is not well-suited to work with transducers, which essentially have to work step-by-step.

    Uniq poorly suited for transducers

    uniq, on the other hand, makes less sense with such transducers. The problem is that the container used by uniq, if you're going to get any benefit from transducers, has to be one which has quick inserts and quick lookups, a Set or an Object most likely. Let's say we use a Set. Then we have a problem, since our flatten operates on lists.

    A different approach

    Since we can't easily fold existing functions into one that does what you're looking for, we probably need to write a one-off.

    The structure of the earlier solution makes it fairly easy to add the uniqueness constraint. Again, that was:

    const flatten = xs => xs.reduce(
      (a, x) => concat(a, isArray(x) ? flatten(x) : [x]), 
      []
    );
    

    With a helper function for adding all elements to a Set:

    const addAll = (set, xs) => xs.reduce((s, x) => s.add(x), set)
    

    We can write a function that flattens, keeping only the unique values:

    const flattenUniq = xs => xs.reduce(
      (s, x) => addAll(s, isArray(x) ? flattenUniq(x) : [x]), 
      new Set()
    )
    

    Note that this has much the structure of the above, switching only to use a Set and therefore switching from concat to our addAll.

    Of course you might want an array, at the end. We can do that just by wrapping this with a Set -> Array function, like this:

    const flattenUniq = xs => Array.from(xs.reduce(
      (s, x) => addAll(s, isArray(x) ? flattenUniq(x) : [x]), 
      new Set()
    ))
    

    You also might consider keeping this result as a Set. If you really want a collection of unique values, a Set is the logical choice.

    Such a function does not have the elegance of a points-free transduced function, but it works, and the exposed plumbing makes the relationships with the original data structure and with the plain flatten function much more clear.


    I guess you can think of this entire long answer as just a long-winded way of pointing out what user633183 said in the comments: "neither flatten nor uniq are good use cases for transducers."