I would like to write a function that takes as input an array
with m
rows and n
columns and outputs an array of q
rows and n+1
columns, where q
is the number of unique row-wise combinations in the input array
, and the additional column is integers representing the count of each unique combination. Example image:
My attempted solution is:
UNIQUECOUNT = LAMBDA(array,
LET(
u, UNIQUE(array),
aj, BYROW(array,
LAMBDA(row, TEXTJOIN("\", FALSE, row))),
uj, BYROW(u, LAMBDA(row, TEXTJOIN("\", FALSE, row))),
c, COUNTIF(aj, uj),
HSTACK(u, c)
)
);
Where:
u
is each unique row (q
×n
)
aj
is a 'joined' version of the input array (m
×1
)
uj
is the 'joined' version of each unique row (q
×1
)
c
is the count of each unique row (q
×1
)
Assume I have guaranteed that "" does not exist in any of my data.
The problem:
This formula returns an m
×1
array of #VALUE!
errors. However, if these steps are broken into multiple cells that refer to the previous step by spilled cell address (ex. D2#
), the result works as expected.
In my mind, this means there's some kind of 'type' error going on, where the function works when given a range but not an array?
Minimal example:
With this in mind, I tried to set up a minimal test to replicate the problem. In this example, I'm only trying to show the counts of an input array with a single column. I have written 4 different ways of defining a
which should perform the same, but do not:
TEST = LAMBDA(array,
LET(
a, array,
// a, BYROW(array, LAMBDA(row, row)),
// a, MAP(array, LAMBDA(val, val)),
// a, REDUCEMAP(array, LAMBDA(val, val)),
u, UNIQUE(a),
COUNTIF(a, u)
)
);
m
×1
#VALUE!
errorsMAP
instead of BYROW
, but the same errors occurREDUCE
to make it function like MAP
by using VSTACK
, but the same errors occur(function below for reference)
REDUCEMAP = LAMBDA(array, func,
DROP(
REDUCE(0, array, LAMBDA(acc, next, VSTACK(acc, func(next)))), 1
)
);
Other answers suggested that the implicit intersection operator @
can resolve issues, but I've had no luck changing any of the above behavior using it.
So, my practical question is: How can I write a LAMBDA function that fits the description given at the top of the post?
And my theoretical question is: How can I understand why my logic above works when performed cell-by-cell but not when put into LET variables? Is range
vs array
a useful distinction, and if so are there ways to mitigate these array
problems without needing to dump them into a cell to turn into a range
?
This solution will ALWAYS deliver an n+1 column solution:
=LET(
d, A2:B9,
ad, BYROW(d, LAMBDA(b, CONCAT(b))),
ud, UNIQUE(d),
uad, UNIQUE(ad),
HSTACK(ud, MAP(uad, LAMBDA(m, COUNTA(FILTER(ad, m = ad)))))
)