Search code examples
excelmultiple-columns

Excel - I have several tables adjacent to each other and would like to SUMIFS for all tables at the same time


As an example, here is my Table: Input data

My objective with this data is to sum the Values in the Amount columns which fit specific Project, Section and Date criteria.

As you can see there are 2 sets of Project|Section|Amount, in the final product there will be as many as 30, rendering sumifs quite a pain to write and also to reuse for diferent project and section criteria.

The explanation for this format is the each invoice can be applied to several project/sections. Which is shown in the first row since the 2 amounts combine to be equal to the total in the same row (1000+200=1200)

Expected result:

Expected output

In my expected result the formula would check every Project-Section-Amount dataset, and for eache ntry would check if the first column is of the expected month, if project and section matches the given values, and add them up.

I have tried googling diferent functions and formulas, a combination of sumproduct with sumifs and indexes or offsets, but have had no success.


Solution

  • Updated the response, based on the latest OP's feedback. Maybe there are shorter ways of doing it (it is not a simple task anyway). In the meantime you can try the following approach:

    =LET(in,A1:J5, h, TAKE(in,1), data, DROP(in,1),
     SOMs, EOMONTH(1*TAKE(data,,1),-1)+1, uxSOMs,UNIQUE(SOMs),
     QRY, LAMBDA(lk, CHOOSECOLS(data, FILTER(SEQUENCE(,COLUMNS(h)), h=lk))),
     prjs, QRY("Project"), sects, QRY("Section"), amnts, QRY("Amount"),
     n, COLUMNS(prjs), seq,SEQUENCE(n),dates,IF(TOROW(seq),SOMs),
     wdata, WRAPROWS(TOROW(CHOOSECOLS(HSTACK(dates, prjs,sects,amnts),
      TOROW(HSTACK(seq, n+seq, 2*n+seq, 3*n + seq)))),4),
     f, FILTER(wdata, (INDEX(wdata,,2)<>"") + (INDEX(wdata,,3)<>"")),
     prjSec, UNIQUE(CHOOSECOLS(f,2,3)),
     HCALC, LAMBDA(set,
      HSTACK(TAKE(DROP(set,,1),1,2), MMULT(SEQUENCE(,ROWS(set),,0),
        IF(TAKE(set,,1)=TOROW(uxSOMs), TAKE(set,,-1),0)))),
     REDUCE(HSTACK({"Project","Section"}, TOROW(uxSOMs)),
      SEQUENCE(ROWS(prjSec)), LAMBDA(ac,s,
      VSTACK(ac, HCALC(FILTER(f, (INDEX(f,,2)=INDEX(prjSec,s,1)) 
        * (INDEX(f,,3)=INDEX(prjSec,s,2))))))))
    

    Here is the output: output

    We have a single name (in) that depends on the input range, the rest of the variables are deduced from in: h, the header, data, the input data without the header. SOMs, the start date of the month from the input dates.

    To extract project (prjs), sections (sects) and amounts (amnts) columns, we defined a user LAMBDA function QRY, to avoid repeating the same calculation for each of them, so we call QRY with the corresponding parameter to get each of them.

    The main goal is to accommodate the input in a way it is easier to do the calculation by a unique combination of project and section and unique months. We plan to use REDUCE/VSTACK pattern(1). This transformed input data is represented by the name f. For this sample data this is how f will look like:

    6/1/2023    OCL CA  1000
    6/1/2023    OL  LAP 200
    6/1/2023    OL  LAP 2000
    6/1/2023    OL  LAP 3000
    7/1/2023    OL  LAP 600
    

    where each column represents: date, project, section, and amount. Having this input data. We can iterate over the unique combination of project and section and summarize the amount for each unique month. We do this via REDUCE/VSTACK pattern. Where on each iteration it generates for a given combination of projects and sections the totals for unique months, via the user LAMBDA function: HCALC.

    To get f, it first generates as many columns as the dataset we have (n) for dates, repeating the date column, this information is stored in dates name via: IF(TOROW(seq),SOMs), i.e, n-columns with repeated dates. Then it stacks horizontally the group of dates, projects, sections, and amounts via HSTACK(dates, prjs,sects,amnts). To select the columns in a specific order we use the following:

    TOROW(HSTACK(seq, n+seq, 2*n+seq, 3*n + seq) 
    

    Now we are ready via WRAPROWS of 4 (date, project, section, and amount), to have the data in the way we wanted (wdata). Now it just needs to remove rows with no project or section information and that is the FILTER call to get finally f. We use this filter to focus the calculation where there is information only.

    Having f, we can then get the unique combination of projects and sections (prjSec).

    The logic for doing the calculation is inside HCALC. The input set, is the f data filtered by one of the unique combinations of prjSec. We use MMULT to identify where the dates are equal to the unique start of the months (uxSOM):

    IF(TAKE(set,,1)=TOROW(uxSOMs), TAKE(set,,-1),0)
    

    The previous formula generates on each column the corresponding amount for unique months. We have as many columns as unique months we have. Now it just needs to do the multiplication via MMULT to sum by column the amounts. It uses a row array of 1s to do the multiplication: SEQUENCE(,ROWS(set),,0). Finally, we use HSTACK to add horizontally to the MMULT result, the project and the corresponding section (TAKE(DROP(set,,1),1,2)).

    Finally, we use REDUCE/VSTACK with the accumulator (ac) initialized with the header of the output. We iterate over the number of unique pair of combinations of project and section: SEQUENCE(ROWS(prjSec)), invoking on each iteration HCALC filtered by the corresponding project and section on each iteration.

    Notes:

    1. The MMULT calculation can be replaced with BYCOL, but usually MMULT is more efficient.
    2. The previous formula doesn't depend on a specific position for the key columns, it finds such columns looking in the header (h).

    (1): how to transform a table in Excel from vertical to horizontal but with different length