As an example, here is my Table:
My objective with this data is to sum the Values in the Amount columns which fit specific Project, Section and Date criteria.
As you can see there are 2 sets of Project|Section|Amount, in the final product there will be as many as 30, rendering sumifs quite a pain to write and also to reuse for diferent project and section criteria.
The explanation for this format is the each invoice can be applied to several project/sections. Which is shown in the first row since the 2 amounts combine to be equal to the total in the same row (1000+200=1200)
Expected result:
In my expected result the formula would check every Project-Section-Amount dataset, and for eache ntry would check if the first column is of the expected month, if project and section matches the given values, and add them up.
I have tried googling diferent functions and formulas, a combination of sumproduct with sumifs and indexes or offsets, but have had no success.
Updated the response, based on the latest OP's feedback. Maybe there are shorter ways of doing it (it is not a simple task anyway). In the meantime you can try the following approach:
=LET(in,A1:J5, h, TAKE(in,1), data, DROP(in,1),
SOMs, EOMONTH(1*TAKE(data,,1),-1)+1, uxSOMs,UNIQUE(SOMs),
QRY, LAMBDA(lk, CHOOSECOLS(data, FILTER(SEQUENCE(,COLUMNS(h)), h=lk))),
prjs, QRY("Project"), sects, QRY("Section"), amnts, QRY("Amount"),
n, COLUMNS(prjs), seq,SEQUENCE(n),dates,IF(TOROW(seq),SOMs),
wdata, WRAPROWS(TOROW(CHOOSECOLS(HSTACK(dates, prjs,sects,amnts),
TOROW(HSTACK(seq, n+seq, 2*n+seq, 3*n + seq)))),4),
f, FILTER(wdata, (INDEX(wdata,,2)<>"") + (INDEX(wdata,,3)<>"")),
prjSec, UNIQUE(CHOOSECOLS(f,2,3)),
HCALC, LAMBDA(set,
HSTACK(TAKE(DROP(set,,1),1,2), MMULT(SEQUENCE(,ROWS(set),,0),
IF(TAKE(set,,1)=TOROW(uxSOMs), TAKE(set,,-1),0)))),
REDUCE(HSTACK({"Project","Section"}, TOROW(uxSOMs)),
SEQUENCE(ROWS(prjSec)), LAMBDA(ac,s,
VSTACK(ac, HCALC(FILTER(f, (INDEX(f,,2)=INDEX(prjSec,s,1))
* (INDEX(f,,3)=INDEX(prjSec,s,2))))))))
We have a single name (in
) that depends on the input range, the rest of the variables are deduced from in
: h
, the header, data
, the input data without the header. SOMs
, the start date of the month from the input dates.
To extract project (prjs
), sections (sects
) and amounts (amnts
) columns, we defined a user LAMBDA
function QRY
, to avoid repeating the same calculation for each of them, so we call QRY
with the corresponding parameter to get each of them.
The main goal is to accommodate the input in a way it is easier to do the calculation by a unique combination of project and section and unique months. We plan to use REDUCE/VSTACK
pattern(1). This transformed input data is represented by the name f
. For this sample data this is how f
will look like:
6/1/2023 OCL CA 1000
6/1/2023 OL LAP 200
6/1/2023 OL LAP 2000
6/1/2023 OL LAP 3000
7/1/2023 OL LAP 600
where each column represents: date, project, section, and amount. Having this input data. We can iterate over the unique combination of project and section and summarize the amount for each unique month. We do this via REDUCE/VSTACK
pattern. Where on each iteration it generates for a given combination of projects and sections the totals for unique months, via the user LAMBDA
function: HCALC
.
To get f
, it first generates as many columns as the dataset we have (n
) for dates, repeating the date column, this information is stored in dates
name via: IF(TOROW(seq),SOMs)
, i.e, n
-columns with repeated dates. Then it stacks horizontally the group of dates, projects, sections, and amounts via HSTACK(dates, prjs,sects,amnts)
. To select the columns in a specific order we use the following:
TOROW(HSTACK(seq, n+seq, 2*n+seq, 3*n + seq)
Now we are ready via WRAPROWS
of 4
(date, project, section, and amount), to have the data in the way we wanted (wdata
). Now it just needs to remove rows with no project or section information and that is the FILTER
call to get finally f
. We use this filter to focus the calculation where there is information only.
Having f
, we can then get the unique combination of projects and sections (prjSec
).
The logic for doing the calculation is inside HCALC
. The input set
, is the f
data filtered by one of the unique combinations of prjSec
. We use MMULT
to identify where the dates are equal to the unique start of the months (uxSOM
):
IF(TAKE(set,,1)=TOROW(uxSOMs), TAKE(set,,-1),0)
The previous formula generates on each column the corresponding amount for unique months. We have as many columns as unique months we have. Now it just needs to do the multiplication via MMULT
to sum by column the amounts. It uses a row array of 1
s to do the multiplication: SEQUENCE(,ROWS(set),,0)
. Finally, we use HSTACK
to add horizontally to the MMULT
result, the project and the corresponding section (TAKE(DROP(set,,1),1,2)
).
Finally, we use REDUCE/VSTACK
with the accumulator (ac
) initialized with the header of the output. We iterate over the number of unique pair of combinations of project and section: SEQUENCE(ROWS(prjSec))
, invoking on each iteration HCALC
filtered by the corresponding project and section on each iteration.
Notes:
MMULT
calculation can be replaced with BYCOL
, but usually MMULT
is more efficient.h
).(1): how to transform a table in Excel from vertical to horizontal but with different length