Search code examples
excelarray-formulas

Excel multilevel array formula with partial string matches to sum resultant cells


I've been trying to sort this for over a day now without much luck. I have successfully used SUMIFS, INDEX, MATCH, COUNTIF, "--" etc array functions previously and am not a novice, but also not an expert on these. I can't seem to weave these together correctly, and likely on an altogether incorrect path.

Basically, I am trying to aggregate data from multiple spreadsheets, requiring a mapping of various items (rows) into a canonical form for summing.

The image here shows a representative, but simplified version of my quest. Each "region" on this example spreadsheet (Final..., Mapping, DataSet1, DataSet2) is actually in different spreadsheets, and there are several sheets with 50-150 rows in each xlsx.

The image here shows a representative, but simplified version of the problem I am trying to solve. As mentioned, each "region" (Final..., Mapping, DataSet1, DataSet2) is actually in different spreadsheets, and there are several sheets in each xlsx.

Note that the names in Column B are quite arbitrary (meaning not all P1's have an 'x' pattern, like shown here as x1, x2, etc. Do not rely on any pattern in the names, except the x, y , z in the Mapping table are substrings (case insensitive, trailing match) of the names in Column B in the DataSets.

And in the image, the Final Result Table (summed manually) is what I want to compute via(an array) formula: A single formula would be ideal (given I have many spreadsheets from which the monthly data is being pulled from, so I can't readily modify but can create an interim spreadsheet if required, so open to helper columns or helper rows).

Here's the process - For each name (B3-B5) in the Final Result Table, I want to sum the name from it's components as follows:

  1. Lookup all the matches in the Mapping Table (so for P1, the formula =IF($C$10:$C$15=$B3, $B$10:$B$15,"") gives {"x1";"";"";"x2";"";"x3"}.
  2. I then want to search each of x1, x2, and x3 in B19:B26 to get rows 21, 22, 24, 25, 26 in DataSet1 and B31:B35 to get row 32 in DataSet2, to then add up the Jan totals into C3. (Effectively, C3=C21+C22+C24+C25+C26+C32). Same for P2 and P3, and thru Feb, Mar, ...

I am stuck on how to remove blank or 0 or Div0 or such "error rows" from the interim result in 2, and also need to use 2 arrays of different sizes (3 valid rows in example 2 above, ignoring blanks) to search many rows in DataSets. I tried SEARCH("*"&IF($C$10:$C$15=$B3, $B$10:$B$15,""), $B$19:$B$26) but get unexpected results. I have tried to replace text in the interim result {"x1";"";"";"x2";"";"x3"} with TRUE/FALSE, and 1/0, etc. to help with INDEX or MATCH, but am stymied by errors in downstream ("surrounding") formulas.

Thanks in advance.


Solution

  • Here is a solution without resorting to nasty (imo) CSE formulas.

    =  SUMPRODUCT($C$19:$F$26*(COUNTIFS($B$10:$B$15, RIGHT($B$19:$B$26,2),$C$10:$C$15,$B3)>0)*($C$18:$F$18=C$2))
       +
       SUMPRODUCT($C$31:$F$35*(COUNTIFS($B$10:$B$15, RIGHT($B$31:$B$35,2),$C$10:$C$15,$B3)>0)*($C$30:$F$30=C$2))
    

    There is one SUMPRODUCT for each data set. If possible, it would be better to put all your data sets into a single table with a column identify which data set it is a part of.

    enter image description here

    The way it works is to takes each values in your data set and multiplies it by whether the 2 right most character appear in your mapping table for that P code, multiplied by whether the value is in the correct month. So it returns 0 if either of those conditions are false. Then returns the sum.

    UPDATE IN RESPONSE TO OP COMMENTS

    If, the X,Y, Z codes are not always 2 digits but the first part is ALWAYS 8 digits, you can easily amend the:

    RIGHT($B$19:$B$26,2)
    

    to be:

    RIGHT($B$19:$B$26,LEN($B$19:$B$26)-8)
    

    Making the formula for the first data set:

    =SUMPRODUCT($C$19:$F$26*(COUNTIFS($B$10:$B$15, RIGHT($B$19:$B$26,LEN($B$19:$B$26)-8),$C$10:$C$15,$B3)>0)*($C$18:$F$18=C$2))
    

    And you can amend for other data sets and simply add them together.