Search code examples
excelexcel-formulapowerquerym

Power Query M - return most recent column value


O365

I'm using PQ to ETL a dataset into Excel.

ID Status Date CurrentStatus
1 Active 1/1/2022 Terminated
1 Terminated 1/10/2022 Terminated

Under CurrentStatus, lookup the row's ID and find the record with the latest Date for that ID then return the Status for that record.

Current setup:

Formula is attached to the output table.

Current formula =XLOOKUP([@ID]&MAXIFS([Date],[ID],[@ID]),[ID]&[Date],[Status],"")

Problem:

Calculation takes forever

Goal:

Build the solution into the PQ M code instead of the formula.

Any guidance is appreciated, thanks.


Solution

  • In powerquery, you can group on ID, and within that, sort on date then take the most recent result. Then expand the status

    Load the data into powerquery with data from table/range, right click the ID column and Group by

    take the default code

    = Table.Group(#"Changed Type", {"ID"}, {{"Count", each Table.RowCount(_), Int64.Type}})
    

    and replace the end to resemble this:

    = Table.Group(#"Changed Type", {"ID"}, {{"data", each _, type table },     {"MaxStatus", each Table.FirstN(Table.Sort(_,{{"Date", Order.Descending}}),1)[Status]{0}}})
    

    then use the arrow atop the new column to [x] expand the Status, Date and CurrentStatus columns

    Full code:

    let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Status", type text}, {"Date", type date}, {"CurrentStatus", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"ID"}, {{"data", each _, type table },     {"MaxStatus", each Table.FirstN(Table.Sort(_,{{"Date", Order.Descending}}),1)[Status]{0}}}),
    #"Expanded data" = Table.ExpandTableColumn(#"Grouped Rows", "data", {"Status", "Date", "CurrentStatus"}, {"Status", "Date", "CurrentStatus"})
    in #"Expanded data"
    

    enter image description here