Search code examples
excelpowerbipowerquerypowerbi-desktopm

Remove duplicate elements in column1 based on condition in column3


I have the table below for which I want to remove the duplicated values in first column (HEADER1).

Column3 could be empty or have one or more elements separated by 3 consecutive double points :::.

Then, the second condition would be to look in column3 (HEADER 3). compare between the other rows and if there are rows that have values in column3, select the row that has less values separated by :::

This only applies when there is a comparison between repeated values in column1. For example, the value ap3 in column1 appears 3 times, but the row to select would be the row with value 226 in column3 (only one value).

This is the source table to the left and expected output to the right.

enter image description here

This is my current output in power bi, that is selecting the first value that is repeated, and removes the others.

enter image description here

This is my current code:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("lY/NCgIhFIVfRdyOBf6MV93NO7QIpllMaZugZJSBevquI8EUtWgjB/y+47Hv6RglZRSsxXP0fgopEY5ZCO2ca0FyOjDEzpeCSb7CxOHazVuyvz8YSXkKIZPueEKg0bLIANJZKxY/Ro8Xiq99zWrBLt9SxPxeYlpV1a8Lsf6VbV37my0Pg0Lj40//b4KlpjFGtBuB64Yn", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [HEADER1 = _t, HEADER2 = _t, HEADER3 = _t, HEADER4 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"HEADER1", type text}, {"HEADER2", type text}, {"HEADER3", type text}, {"HEADER4", type text}}),
    #"Removed Duplicates" = Table.Distinct(#"Changed Type", {"HEADER1"})
in
    #"Removed Duplicates"

Solution

  • enter image description here

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"HEADER1", type text}, {"HEADER2", Int64.Type}, {"HEADER3", type text}, {"HEADER4", type text}}),
        #"Added Custom" = Table.AddColumn(#"Changed Type", "Sort", each Text.Length( Text.Select([HEADER3], ":"))),
        #"Sorted Rows" = Table.Buffer( Table.Sort(#"Added Custom",{{"HEADER1", Order.Ascending}, {"Sort", Order.Ascending}})),
        #"Removed Duplicates" = Table.Distinct(#"Sorted Rows", {"HEADER1"}),
        #"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates",{"Sort"})
    in
        #"Removed Columns"