Search code examples
exceltypescriptm

Select and replace duplicates


I'm trying to achieve this with either Power Query or Typescript.

A column is populated with many values that are duplicated.

Row No. Date
1 01.01.2023
2 01.01.2023
3 01.01.2023
4 02.01.2023
5 01.01.2023
6 03.01.2023
7 02.01.2023

I don't want to remove the entire row if a cell is duplicate, but rather replace all duplicate values with a string.

Row No. Date
1 01.01.2023
2 cont.
3 cont.
4 02.01.2023
5 cont.
6 03.01.2023
7 cont.

Solution

  • Here is a Power Query solution:

    let
        Source = Excel.CurrentWorkbook(){[Name="Table23"]}[Content],
    
    //Date typed as text to maintain format when loaded back to Excel
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"Row No.", Int64.Type}, {"Date", type text}}),
    
    //add index column for sorting
        #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1, Int64.Type),
    
    //Group by date
    // then replace all but first row with string
        #"Grouped Rows" = Table.Group(#"Added Index", {"Date"}, {
            {"Repl Dups",(t)=>Table.ReplaceValue(
                t,
                each [Index],
                "cont.",
                (x,y,z)=> if y = List.Min(t[Index]) then x else z,
                {"Date"}
            ), type table [#"Row No."=nullable number, Date=text, Index=number]}}),
    
    //Expand and sort the grouped table
    // removing unneeded columns
        #"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"Date"}),
        #"Expanded Repl Dups" = Table.ExpandTableColumn(#"Removed Columns", "Repl Dups", {"Row No.", "Date", "Index"}, {"Row No.", "Date", "Index"}),
        #"Sorted Rows" = Table.Sort(#"Expanded Repl Dups",{{"Index", Order.Ascending}}),
        #"Removed Columns1" = Table.RemoveColumns(#"Sorted Rows",{"Index"})
    in
        #"Removed Columns1"
    

    enter image description here