Search code examples
powerquery

Power query Lookup and filtering for earliest start date


I have a current sheet that is pulling from multiple sources and has multiple rows and columns. Spreadsheet is similar to this.

File_date, Project#, Sugg_Start_date,....

I am attempting to add an additional column named First_Sugg_Start. The issue is that with every daily data dump, if the Sugg_Start_date was not started, that date adjusts to the current date of the data dump. I need to know the first date of the Sugg_Start_date for all projects. But, I only want to see the projects that show up on the most recent data dump.

Example

Date Project# Sugg_Start
01/01/2025 123456 01/01/2025
01/01/2025 123457 01/01/2025
01/01/2025 123458 01/01/2025
01/01/2025 123459 01/01/2025
01/02/2025 123457 01/02/2025
01/02/2025 123458 01/02/2025
01/02/2025 123459 01/02/2025
01/03/2025 123457 01/03/2025
01/03/2025 123458 01/03/2025
01/03/2025 123459 01/03/2025

The output I would like is

Date Project# Sugg_Start_Date First_Sugg_Start
01/03/2025 123457 01/03/2025 01/01/2025
01/03/2025 123458 01/03/2025 01/01/2025
01/03/2025 123459 01/03/2025 01/01/2025

(I have additional columns I would not like to filter out)

The issue I am having is that if I select by earliest First_Sugg_Start, I cannot display the most recent data contained in File_date. And, when I do filter, I lose the remaining columns that I also need as part of the report (for example PM, description, item, etc....)

I have tried to make an additional table, index on the Project# and pull that data in, but it fails. I have since written a query.

Let
Source = Folder.Files("C:\6.0\Task List"), 
#"Filter Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true), 
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", Transform File", each #"Transform File"([Content])), 
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name, "Source.Name"]), 
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File"}), 
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1". "Transform File", Table.ColumnNames(#"Transform File"(#"Sample File"))), 
#"Changed Type" = Table.TransformColumnTypes(#Expanded Table Column1", {{"Source.Name", type text}, {"Date", type date}, {"Project#", type text}, {"Sugg_Start", type date}, {"Task#", type text}}), 
#"Added Custom" = Table.AddColumn(#"ChangedType, "Project#_with_Task", each [Project#]&" "&[Task#]), 
#"Grouped Rows" = Table.Group(#"Added Custom", {Project#_with_Task", "Sugg_Start}, {"First_Sugg_Start"}, each List.First([Sugg_Start]), type date), 
#"Added Custom1" = Table.AddColumn(#Grouped Rows", [First_Sugg_Start])
in
#"Added Custom1"

Everything runs fine until I get to the last section:

{"First_Sugg_Start"}, each List.First([Sugg_Start]), type date), 
#"Added Custom1" = Table.AddColumn(#Grouped Rows", [First_Sugg_Start])
in
#"Added Custom1"

Thanks.


Solution

  • Seems to me that your approach is rather too complicated for a fairly simple task:

    let
        //Replace with your own data source 
        Source = Excel.CurrentWorkbook(){[Name="Table2]}[Content],
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Sugg_Start", type date}}),
    
        //Group by Project# to find the first Sugg_Start_date
        Grouped = Table.Group(#"Changed Type", {"Project#"}, {{"First_Sugg_Start", each List.Min([Sugg_Start]), type date}}),
    
        // Merge this result back with the original data
        Merged = Table.NestedJoin(#"Changed Type", {"Project#"}, Grouped, {"Project#"}, "Grouped", JoinKind.LeftOuter),
        Expanded = Table.ExpandTableColumn(Merged, "Grouped", {"First_Sugg_Start"}),
    
        // Filter to keep only the most recent data dump
        LatestDate = List.Max(#"Changed Type"[Date]),
        Filtered = Table.SelectRows(Expanded, each [Date] = LatestDate),
        #"Reordered Columns" = Table.ReorderColumns(Filtered,{"Date", "Project#", "Sugg_Start", "First_Sugg_Start", "Task#", "Name"})
    in
        #"Reordered Columns"
    

    enter image description here