Search code examples
excelexcel-formulagroup-bycountif

Excel Groupby and Sum Returning Error due to blank rows


I have a dataset with the following structure

email               category
[email protected]     action
[email protected]    horror
[email protected]   romance
[email protected]     sci-fi
[email protected]   action
[email protected]    action
[email protected]
[email protected]    
[email protected]  sci-fi
[email protected]    horror       
[email protected]                 

and am trying to produce a column 'sum_category' that would group by email to get the sum of unique category values for each unique email like this

email               category     sum_category
[email protected]     action       3
[email protected]    horror       1
[email protected]   romance      2
[email protected]     sci-fi       3
[email protected]   action       2
[email protected]    action       1
[email protected]                 1
[email protected]                 1
[email protected]  sci-fi       1
[email protected]    horror       3
[email protected]                 3

Assuming Column A is 'Email' and Column C is 'category', I have tried using the following code but it is producing #DIV/0! error

SUMPRODUCT((A$2:A$100=A2)/COUNTIFS(C$2:C$100,C$2:C$100,A$2:A$100,A$2:A$100))

I have also tried using the following code but it is counting the blank values into the sum

SUMPRODUCT((A$2:A$100=A2)/COUNTIFS(C$2:C$100,C$2:C$100&"",A$2:A$100,A$2:A$100))

Appreciate any help I could get to resolve this, thank you.


Solution

  • I presume you don't have Office 365, else this would be straightforward, for example using the following spill formula:

    =LET(
        α, A2:A12,
        β, C2:C12,
        BYROW(α, LAMBDA(ζ, COUNTA(UNIQUE(FILTER(β, (α = ζ) * (β <> ""))))))
    )
    

    Otherwise, don't use the reciprocal COUNTIF/SUMPRODUCT set-up: it's incredibly slow compared to the following FREQUENCY/MATCH construction, in D2:

    =SUM(
        IF(FREQUENCY(
            IF(A$2:A$12 = A2,
                IF(C$2:C$12 <> "", MATCH(C$2:C$12, C$2:C$12, 0))
                ),
                ROW(A$2:A$12) - MIN(ROW(A$2:A$12)) + 1
            ), 
          1
        )
    )
    

    and then copied down to D12.