Search code examples
javascriptnode.jsalgorithmperformancepermutation

Subset of permutations in JavaScript with up to~6 duplicates per element


I have tried to this answer to find all permutations of size K=6 of an array of strings, but the array I'm permuting is way too large (~13,000 elements but I can guarantee most of those will be duplicates), which means I'm getting:

....
re-permuting with 6925/12972
node:internal/console/constructor:257
   value: function(streamSymbol, string) {
                   ^
RangeError: Maximum call stack size exceeded  
    at console.value (node:internal/console/constructor:257:20)  
    at console.log (node:internal/console/constructor:359:26)  
    at permute (/home/path/to/code/permutation.js:22:17)  
    at permute (/home/path/to/code/permutation.js:23:9)  
    .....
re-permuting with 6924/12972
....
re-permuting with 6918/12972

And then it dies. I guessed that is the recursion that's the problem.

I know that there's at most ~300 unique elements in my input (which is how I know that many of the array elements must be duplicates), but I don't know if that's 10,000 instances of one element, and then the rest are individually unique elements or at least K of each unique element. Because of that I can't just fed them into a Set, and there maybe fewer than K of one element so I can't just create a new input by duplicating the new ones K times.

Here is my slightly modified (for readability and logging only) version of the code from the above linked answer (on second look, this algorithm is far from optimal):

let permArr = [];
let usedChars = [];
let originalLength;
/**
 * Subsets of permutations of an array
 * @param {any[]} input array of elements to permute
 * @param {number} subsetLength size of the subsets to return
 * @returns {Array[]} and array of arrays
 */
function permute(input, subsetLength = input.length) {
    let index
    let ch;
    originalLength ??= input.length;
    for (index = 0; index < input.length; index++) {
        ch = input.splice(index, 1)[0];
        usedChars.push(ch);
        if (input.length == 0) {
            let toAdd = usedChars.slice(0, subsetLength);
            // resizing the returned array to size k
            if (!permArr.includes(toAdd)) permArr.push(toAdd);
        }
        console.log(`re-permuting with ${input.length}/${originalLength}`)
        permute(input, subsetLength);
        input.splice(index, 0, ch);
        usedChars.pop();
    }
    return permArr
};

And I found this answer but I do not follow it at all, and this other answer is similar but still uses recursion.

How can I make this without recursion/more performant so it can handle much larger arrays? I'm using NodeJs, and I'm not averse to using different data types.


Solution

  • I don't know if that's 10,000 instances of one element, and then the rest are individually unique elements or at least K of each unique element. Because of that I can't just fed them into a Set, and there maybe fewer than K of one element so I can't just create a new input by duplicating the new ones K times

    So just group and count them. Seems simple enough:

    function subsetPermutations(arr, size) {
        const counts = {};
        for (const el of arr) {
            counts[el] = (counts[el] ?? 0) + 1;
        }
        const unique = Object.keys(counts);
        const result = Array.from({length: size});
        function* recurse(depth) {
            if (depth == size) {
                yield result;
            } else {
                for (const el of unique) {
                    if (counts[el]) {
                        result[depth] = el;
                        counts[el]--;
                        yield* recurse(depth+1)
                        counts[el]++;
                    }
                }
            }
        }
        return recurse(0);
    }
    
    for (const perm of subsetPermutations(["a", "b", "b", "c", "c", "c"], 3)) {
        console.log(perm.join('-'));
    }

    I have tried to this answer to find all permutations of size K=6, but the array I'm permuting is way too large (~13,000 elements), however I can guarantee I know that there's at most ~300 unique

    That's still roughtly 3006 permutations, which is far too many to put them into an array. The function above is designed as an iterator so that you can work on the current result in a loop before it gets mutated in the next iteration, to avoid any allocation overhead, but it still will take too long to generate all of them.

    How can I make this without recursion/more performant so it can handle much larger arrays? I'm using NodeJs, and I'm not averse to using different data types.

    You can use a Map instead of the object for counts, but I doubt it'll be much faster for just 300 different elements.

    Avoiding recursion is unnecessary since it only goes 6 level deep, there won't be a stack overflow unlike with your inefficient solution. But for performance, you might still try this approach of dynamically generating nested loops.