Search code examples
excelexcel-formulaoffice365

In Excel 365, how can I get unique counts of a multi-column array?


I would like to write a function that takes as input an array with m rows and n columns and outputs an array of q rows and n+1 columns, where q is the number of unique row-wise combinations in the input array, and the additional column is integers representing the count of each unique combination. Example image:

target input/output

My attempted solution is:

UNIQUECOUNT = LAMBDA(array,
    LET(
        u, UNIQUE(array),
        aj, BYROW(array,
            LAMBDA(row, TEXTJOIN("\", FALSE, row))),
        uj, BYROW(u, LAMBDA(row, TEXTJOIN("\", FALSE, row))),
        c, COUNTIF(aj, uj),
        HSTACK(u, c)
    )
);

Where:

u is each unique row (q×n)

aj is a 'joined' version of the input array (m×1)

uj is the 'joined' version of each unique row (q×1)

c is the count of each unique row (q×1)

Assume I have guaranteed that "" does not exist in any of my data.

The problem:

This formula returns an m×1 array of #VALUE! errors. However, if these steps are broken into multiple cells that refer to the previous step by spilled cell address (ex. D2#), the result works as expected.

working cell by cell

In my mind, this means there's some kind of 'type' error going on, where the function works when given a range but not an array?

Minimal example:

With this in mind, I tried to set up a minimal test to replicate the problem. In this example, I'm only trying to show the counts of an input array with a single column. I have written 4 different ways of defining a which should perform the same, but do not:

TEST = LAMBDA(array,
    LET(
        a, array,
        // a, BYROW(array, LAMBDA(row, row)),
        // a, MAP(array, LAMBDA(val, val)),
        // a, REDUCEMAP(array, LAMBDA(val, val)),

        u, UNIQUE(a),
        COUNTIF(a, u)
    )
);
  1. Base case, performs as expected
  2. Similar to but simpler than my original function, gives m×1 #VALUE! errors
  3. Inspired by answers on SO, I tried to use MAP instead of BYROW, but the same errors occur
  4. Following other answers, I wrote a wrapper around REDUCE to make it function like MAP by using VSTACK, but the same errors occur

(function below for reference)

REDUCEMAP = LAMBDA(array, func,
    DROP(
        REDUCE(0, array, LAMBDA(acc, next, VSTACK(acc, func(next)))), 1
    )
);

Other answers suggested that the implicit intersection operator @ can resolve issues, but I've had no luck changing any of the above behavior using it.

So, my practical question is: How can I write a LAMBDA function that fits the description given at the top of the post?

And my theoretical question is: How can I understand why my logic above works when performed cell-by-cell but not when put into LET variables? Is range vs array a useful distinction, and if so are there ways to mitigate these array problems without needing to dump them into a cell to turn into a range?


Solution

  • This solution will ALWAYS deliver an n+1 column solution:

    =LET(
        d, A2:B9,
        ad, BYROW(d, LAMBDA(b, CONCAT(b))),
        ud, UNIQUE(d),
        uad, UNIQUE(ad),
        HSTACK(ud, MAP(uad, LAMBDA(m, COUNTA(FILTER(ad, m = ad)))))
    )
    

    enter image description here