I have a query in Power Query Editor that has everything i need. However, due to splitting columns and other steps within the process, i have duplicate values in a column that hyper inflates my numbers. Is there an applied step that i can add to the end of my query that will remove duplicate values in "Column 4" based on the value in "Column 5". Column 5 is a unique ID and i only need the first value under Column 4 based on the ID.
The example below is just a small example as my current output has 51 columns and quite a bit of applied steps in the current query.
duplicates overall in the column are okay, but cannot have duplicates based on the value in another column and the basic remove duplicates removes all.
A relatively simple way. Keeps the first non-blank for Column4
#"PriorStepName" = .............,
#"Grouped Rows" = Table.Group(#"PriorStepName", {"Column5"}, {{"data", each
let a = _,
b = Table.RemoveColumns(a,{"Column4"}),
c = {{List.RemoveNulls(a[Column4]){0}}},
d = Table.FromColumns(Table.ToColumns(b) & c,Table.ColumnNames(b)&{"Column4"})
in d , type table }}),
#"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"Column5"}),
#"Expanded data" = Table.ExpandTableColumn(#"Removed Columns", "data", Table.ColumnNames(#"PriorStepName"),Table.ColumnNames(#"PriorStepName"))
in #"Expanded data"