Remove Multiple Rows from Range Satisfying Conditions


I'm pretty new to doing 'advanced' stuff in Excel, so I wasn't able to figure this out myself. I'm working with the following Google Sheets spreadsheet:

[screenshot of the spreadsheet]

Basically the algorithm I want to implement for counting elements is as follows:

  • For each element group (I, II, III, IV) find the highest element value.
  • Sum these element values up.
  • Let M be the total number of unique element groups found.
  • After removing those M per-group maxima, sum the 7-M highest of all remaining element values, disregarding element groups.

This should work for any M = 1, 2, 3, 4, and the 7 can be replaced by any number from 5 to 10.

In this case, the highest element of each individual group is:

I: 0.3 -- II: 0.1 -- III: 0.3 -- IV: 0.2

We then look at the highest values other than these until we have 7 values in total; that is, after removing these four from the list, we take the 3 highest values remaining. These are 0.3 twice and 0.2 once. The total sum becomes (0.3 + 0.1 + 0.3 + 0.2) + (0.3 + 0.3 + 0.2) = 0.9 + 0.8 = 1.7.
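
For reference, here is a minimal Python sketch of the counting rule described above. The function name `group_top_sum`, the `total` parameter (the 7 in the description), and the sample rows are all hypothetical; the rows are made up so that each group's maximum matches the values listed above, since the actual sheet contents are not reproduced here.

```python
from collections import defaultdict


def group_top_sum(rows, total=7):
    """Sum the best value of each group plus the (total - M) best leftovers.

    rows  : iterable of (group, value) pairs
    total : how many values are counted overall (7 in the question)
    """
    by_group = defaultdict(list)
    for group, value in rows:
        by_group[group].append(value)

    best = []       # highest value of each group
    remaining = []  # all other values, with groups disregarded
    for values in by_group.values():
        ordered = sorted(values, reverse=True)
        best.append(ordered[0])
        remaining.extend(ordered[1:])

    m = len(best)  # M: number of unique groups found
    extra = sorted(remaining, reverse=True)[:max(total - m, 0)]
    return sum(best) + sum(extra)


# Hypothetical data, NOT the original sheet: chosen so the per-group
# maxima are I: 0.3, II: 0.1, III: 0.3, IV: 0.2.
rows = [
    ("I", 0.3), ("I", 0.3), ("I", 0.1),
    ("II", 0.1), ("II", 0.05),
    ("III", 0.3), ("III", 0.3), ("III", 0.2),
    ("IV", 0.2), ("IV", 0.1),
]

print(round(group_top_sum(rows), 2))
```

With these rows the sketch prints 1.7, matching the hand calculation. Passing `total=5` would count only one value beyond the four group maxima.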

I had already tried to do this taking into account element group IV only, also assuming it only appears once. I'm not sure if it exactly works in this case, but besides that point I'm also trying to extend it to make it work for this algorithm in which all the element groups are taken into account. What I used is

    =IFERROR(QUERY({ROW(B2:D11),B2:D11},"select Col1 where Col2='IV' limit 1",0),0) + sum(sortn(FILTER(D2:D11, ROW(D2:D11)<>ArrayFormula(IFERROR(QUERY({ROW(B2:D11),B2:D11},"select Col1 where Col2='IV' limit 1",0),0))), 6, 0, 1, false))

The idea is that I add the value of IV if it exists, and then sum over the top 6 of a sorted list of which the IV-entry value is removed.

Does anyone have an idea how to remove multiple rows at the same time, or any other way to make this algorithm work? I haven't got much experience with SQL or the like, but I'm sure I could make this algorithm work in Python using dataframes or lists.

Edit: I did try nesting FILTER(FILTER(...)) four times, each time filtering out one value from a specific element group, counting how many such elements were found, and building a sorted list of length 7-M to sum over, but I didn't get this to work and I'm not sure this is the way to go.


Solution

  • For each element group (I, II, III, IV) find the highest element value.

    =SORTN(SORT({A2:A, C2:C}, 2, 0), 9^9, 2, 1, 1)

  • Sum these element values up.

    =SUM(SORTN(SORT({A2:A, C2:C}, 2, 0), 9^9, 2, 1, 1))

  • Let M be the total number of unique element groups found.

    =COUNTUNIQUE(A2:A)


Update, combining all steps (the 7-M highest remaining values plus the per-group maxima):

    =SUM(SORTN(FILTER(C2:C, NOT(REGEXMATCH(A2:A&" "&C2:C&" "&
     COUNTIFS(A2:A&C2:C, A2:A&C2:C, ROW(A2:A), "<="&ROW(A2:A)), TEXTJOIN("|", 1, 
     QUERY(TRANSPOSE(SORTN(SORT({A2:A, C2:C}, 2, 0), 9^9, 2, 1, 1)),,
     9^9)&" 1")))), 7-COUNTUNIQUE(A2:A), 0, 1, 0))+
     SUM(SORTN(SORT({A2:A, C2:C}, 2, 0), 9^9, 2, 1, 1))
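
As far as the combined formula can be read, it tags every row with a key of the form "group value n", where n is the running count of that group/value pair (the COUNTIFS part), drops the rows whose key equals "group maximum 1" (the first occurrence of each group's highest value), takes the 7-M highest of what is left, and adds the sum of the per-group maxima. The Python sketch below mirrors those steps to make the logic easier to follow. It is an illustration, not a translation: the function name `formula_style_sum` and the sample rows are hypothetical, and it compares keys as exact tuples where the formula uses REGEXMATCH on joined strings.

```python
from collections import Counter


def formula_style_sum(rows, total=7):
    # Highest value per group, like SORTN(SORT({A2:A, C2:C}, 2, 0), 9^9, 2, 1, 1)
    best = {}
    for group, value in rows:
        best[group] = max(value, best.get(group, value))

    # Keys to drop: the FIRST occurrence of each group's maximum
    drop = {(group, value, 1) for group, value in best.items()}

    seen = Counter()
    kept = []
    for group, value in rows:
        seen[(group, value)] += 1  # running count, like the COUNTIFS term
        if (group, value, seen[(group, value)]) not in drop:
            kept.append(value)     # row survives the FILTER

    m = len(best)                  # COUNTUNIQUE(A2:A)
    top_rest = sorted(kept, reverse=True)[:max(total - m, 0)]
    return sum(top_rest) + sum(best.values())


# Hypothetical data, NOT the original sheet
rows = [
    ("I", 0.3), ("I", 0.3), ("I", 0.1),
    ("II", 0.1), ("II", 0.05),
    ("III", 0.3), ("III", 0.3), ("III", 0.2),
    ("IV", 0.2), ("IV", 0.1),
]

print(round(formula_style_sum(rows), 2))
```

Because only the first occurrence of each maximum is dropped, a duplicate of a group's highest value (such as the second 0.3 in group I here) stays in the pool for the 7-M remaining picks.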