O365
I'm using PQ to ETL a dataset into Excel.
ID | Status | Date | CurrentStatus |
---|---|---|---|
1 | Active | 1/1/2022 | Terminated |
1 | Terminated | 1/10/2022 | Terminated |
Under CurrentStatus, lookup the row's ID and find the record with the latest Date for that ID then return the Status for that record.
Current setup:
Formula is attached to the output table.
Current formula =XLOOKUP([@ID]&MAXIFS([Date],[ID],[@ID]),[ID]&[Date],[Status],"")
Problem:
Calculation takes forever
Goal:
Build the solution into the PQ M code instead of the formula.
Any guidance is appreciated, thanks.
In powerquery, you can group on ID, and within that, sort on date then take the most recent result. Then expand the status
Load the data into powerquery with data from table/range, right click the ID column and Group by
take the default code
= Table.Group(#"Changed Type", {"ID"}, {{"Count", each Table.RowCount(_), Int64.Type}})
and replace the end to resemble this:
= Table.Group(#"Changed Type", {"ID"}, {{"data", each _, type table }, {"MaxStatus", each Table.FirstN(Table.Sort(_,{{"Date", Order.Descending}}),1)[Status]{0}}})
then use the arrow atop the new column to [x] expand the Status, Date and CurrentStatus columns
Full code:
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Status", type text}, {"Date", type date}, {"CurrentStatus", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"data", each _, type table }, {"MaxStatus", each Table.FirstN(Table.Sort(_,{{"Date", Order.Descending}}),1)[Status]{0}}}),
#"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", {"Status", "Date", "CurrentStatus"}, {"Status", "Date", "CurrentStatus"})
in #"Expanded data"