Search code examples
excelpowerbipowerquerypowerbi-desktopm

How to find the greatest value per ID in Microsoft Excel Power Query?


JOB_ID DAY_ID
430 120
430 720
530 120
530 620
630 120
630 120
630 620
630 620

Should become:

JOB_ID DAY_ID
430 720
530 620
630 620
630 620

Where only the rows where day_id is the maximum value is kept.

I'm currently using Excel's Power Query, so Python and R scripts are not available.


Solution

    • Group by JOB_ID
    • Aggregate by filtering each subtable by the maximum DAY_ID
    let
    
    //Change next line to reflect actual data source
        Source = Excel.CurrentWorkbook(){[Name="Table20"]}[Content],
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"JOB_ID", Int64.Type}, {"DAY_ID", Int64.Type}}),
    
    //Group by JOB_ID
    //  Filter each subtable by maximum DAY_ID
        #"Grouped Rows" = Table.Group(#"Changed Type", 
            {"JOB_ID"}, {
                {"DAY_ID", (t)=>Table.SelectRows(t, each [DAY_ID]=List.Max(t[DAY_ID])), 
                        type table [JOB_ID=Int64.Type, DAY_ID=Int64.Type]}}),
        #"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"JOB_ID"}),
        
        #"Expanded DAY_ID" = Table.ExpandTableColumn(#"Removed Columns", "DAY_ID", {"JOB_ID", "DAY_ID"})
    in
        #"Expanded DAY_ID"
    

    enter image description here