Search code examples
powerquerym

Is there an PowerQuery function or formula to remove (duplicate) rows when particular conditions are met


I'm trying to transform a financial dataset into a more appropriate form. As the dataset has a lot of rows that are almost identical and need to be removed if the sum of column "value" is equal to zero.

Category Number Supplier Value Cost center Description
Application to order 1001 150 + 121 100 Text A
Application to order 1001 150 - 121 100 Text A
Application to order 1002 142 + 115 120 Text B
Application to order 1002 142 - 89 120 Text B
Order 1003 88 + 276 160 Text C
Order 1003 88 - 272 160 Text C
Order 1004 14 + 890 180 Text D
Order 1004 14 - 890 180 Text D

I would like to remove the numbers 1001 and 1004 as the sum of the value is equal to zero.

As I am just starting with PowerQuery my skills are underdeveloped. First thing I tried was to remove duplicate rows, however as the rows are not identical it did not worked out very well. Use Group By on column Category, Supplier and Cost Center. Was almost what I wanted, except only the selected columns where visible and not all columns of the dataset.

The desired output needs to be as follows:

Category Number Supplier Value Cost center Description
Application to order 1002 142 + 115 120 Text B
Application to order 1002 142 - 89 120 Text B
Order 1003 88 + 276 160 Text C
Order 1003 88 - 272 160 Text C

Solution

  • You almost have it.

    Group. Filter. Then expand

    Group: enter image description here

    Filter using drop down atop column Sum to uncheck zero:

    enter image description here

    Expand using arrow atop the table column:

    enter image description here

    let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Category", type text}, {"Number", Int64.Type}, {"Supplier", Int64.Type}, {"Value", Int64.Type}, {"Cost center", Int64.Type}, {"Description", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Category", "Supplier", "Cost center"}, {{"data", each _, type table}, {"Sum", each List.Sum([Value]), type nullable number}}),
    #"Filtered Rows1" = Table.SelectRows(#"Grouped Rows", each ([Sum] <> 0)),
    #"Expanded data" = Table.ExpandTableColumn(#"Filtered Rows1", "data", {"Number", "Value", "Description"}, {"Number", "Value", "Description"})
    in   #"Expanded data"
    

    enter image description here