Search code examples
excelexcel-2021

Count distinct and unique values in pivot table or other way?


I have a contract table, with amendment, type and stats columns. But the table source lists the contract and all amendments as a separate row.

I need to categorize contracts by their type and status, based only on the last stats, i mean, based on the highest amendment.

A pivot table return strange results.

My table sample:

contract    Type    amendment   stats 
contract 4  service 0   active
contract 1  rent    0   expired
contract 3  lending 2   expired
contract 1  rent    2   active
contract 1  rent    1   active
contract 3  lending 1   expired
contract 2  service 2   revoked
contract 4  service 1   active
contract 3  lending 0   expired
contract 2  service 0   expired
contract 3  lending 3   expired
contract 2  service 1   expired
contract 1  rent    3   expired

A pivot data counting distinct contracts:

enter image description here

And when i double clicked in services field:

enter image description here

A contract can't be 2 stats at same time, the last stats (based on highest amendment) is the unique valid to count. "Contract 2" is considered as "revoked" and expired in this pivot table.

How can I do this? I can't use the "remove duplicates" feature because can't delete any rows from the table.


Edit:

The expected output is:

enter image description here

In this image, each contract was counted only once in each type and stats.

It doesn't necessarily have to be a pivot table, if it's possible to do this using only formulas, that's also valid.


Solution

  • This can also be accomplished using Power Query, available in Windows Excel 2010+ and Microsoft 365 (Windows or Mac)

    To use Power Query

    • Select some cell in your Data Table
    • Data => Get&Transform => from Table/Range
    • When the PQ Editor opens: Home => Advanced Editor
    • Make note of the Table Name in Line 2
    • Paste the M Code below in place of what you see
    • Change the Table name in line 2 back to what was generated originally.
    • Read the comments and explore the Applied Steps to understand the algorithm

    After running the Power Query

    • Select Close and Load To
      • PivotTable Report
    • Choose either a new sheet of existing worksheet and a cell
    • Create your pivot table by dragging
      • Type to Rows
      • Status to Columns
      • Status to Values (should default to Count)
    let
    
    //Change next line to reflect data source
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Changed Type" = Table.TransformColumnTypes(Source,{
            {"contract", type text}, {"Type", type text}, {"amendment", Int64.Type}, {"status", type text}}),
    
    //Group by contract
    //then filter each subgroup to extract the item with the latest amendment
        #"Grouped Rows" = Table.Group(#"Changed Type", {"contract"}, {
            
    //If the latest amendment will always refer only to a single item, Table.Distinct can be removed
           {"Latest", (t)=>Table.Distinct(Table.SelectRows(t, each [amendment] = List.Max(t[amendment]))), 
             type table [contract=nullable text, Type=nullable text, amendment=nullable number, status=nullable text]}}),
    
    
    //Expand the appropriate columns in the table
        #"Expanded Latest" = Table.ExpandTableColumn(#"Grouped Rows", "Latest", {"Type", "status"})
    in
        #"Expanded Latest"
    

    Data
    enter image description here

    Results after Pivoting
    enter image description here

    You could also do the entire process in Power Query, although adding the Row Totals and Column Totals is a bit more complex:

    let
    
    //Change next line to reflect data source
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Changed Type" = Table.TransformColumnTypes(Source,{
            {"contract", type text}, {"Type", type text}, {"amendment", Int64.Type}, {"status", type text}}),
    
    //Group by contract
    //then filter each subgroup to extract the item with the latest amendment
        #"Grouped Rows" = Table.Group(#"Changed Type", {"contract"}, {
            
    //If the latest amendment will always refer only to a single item, Table.Distinct can be removed
           {"Latest", (t)=>Table.Distinct(Table.SelectRows(t, each [amendment] = List.Max(t[amendment]))), 
             type table [contract=nullable text, Type=nullable text, amendment=nullable number, status=nullable text]}}),
    
    
    //Expand the appropriate columns in the table
        #"Expanded Latest" = Table.ExpandTableColumn(#"Grouped Rows", "Latest", {"Type", "status"}),
    
    //Pivot on Status and aggregate by Count
        #"Pivoted Column" = Table.Pivot(#"Expanded Latest", 
                                List.Distinct(#"Expanded Latest"[status]), "status", "contract", List.Count),
    
    //Add total column and row
        #"Add Row Totals" = Table.AddColumn(#"Pivoted Column","Row Totals", 
            each List.Sum(List.RemoveFirstN(Record.FieldValues(_))), Int64.Type),
    
        #"Add Column Totals" = Table.InsertRows(
            #"Add Row Totals",
            Table.RowCount(#"Add Row Totals"),
            {Record.FromList({"Grand Total"} 
                & List.Accumulate(
                    List.RemoveFirstN(Table.ColumnNames(#"Add Row Totals"),1),
                    {}, 
                    (s,c)=>s & {List.Sum(
                                    Table.Column(#"Add Row Totals",c))}), Table.ColumnNames(#"Add Row Totals"))})
    in
        #"Add Column Totals"
    

    Result from PQ alone
    enter image description here