Search code examples
powerbidaxpowerquerym

PowerQuery - Finding Most Recent Date Based on Given Value and Current Date


I have a table called 'Table1' which has two columns being 'Name' and 'Date'.

In Power Query Editor, I want to create a custom column called 'Last Date' to find the most recent date when the given Name appeared last time.

For instance, B firstly appeared on 5/7/2019, then on 11/8/2019, and last on 17/9/2019. So the Last Date for B when the Date is 11/8/2019 is 5/7/2019, and the Last Date for B when the Date is 17/9/2019 is 11/8/2019. Please see below example.

enter image description here

I did some research online but got confused with custom function, VAR, Measure, MAXX, CALCULATE, FILTER, etc...

I am not familiar with DAX nor Advanced DAX Editor so if possible please provide detailed answer or clear steps of how to apply your solution.

Let me know if I did not make myself clear. Otherwise your prompt help is greatly appreciated!!


Solution

  • A custom function might look something like:

    let
        AddLastDateColumn = (someTable as table) as table =>
            let
                initialHeaders = Table.ColumnNames(someTable),
                sorted = Table.Sort(someTable, {{"Date", Order.Ascending}, {"Name", Order.Ascending}}),
                merged = Table.NestedJoin(sorted, {"Name"}, sorted, {"Name"}, "$joined", JoinKind.LeftOuter),
                lastDateColumn = Table.AddColumn(merged, "Last Date", each
                    let
                        maxDate = [Date],
                        filtered = Table.SelectRows([#"$joined"], each [Date] < maxDate),
                        lastRow = if not Table.IsEmpty(filtered) then Table.Last(filtered)[Date] else null // Could use Table.Max, but data is already sorted.
                    in lastRow,
                    type nullable date),
                dropColumns = Table.SelectColumns(lastDateColumn, initialHeaders & {"Last Date"})
            in dropColumns
    in
        AddLastDateColumn
    

    If you save the above as its own query, you can then access it in other queries. For example, if you save the above as a query named AddLastDateColumn, you can then access it in other queries (as below):

    let
        sourceTable =
            let
                nameColumn = {"A", "B", "C", "D", "E", "F", "G", "A", "B", "C", "D", "E", "F", "G", "A", "B", "C", "D", "E", "F", "G"},
                dateColumn = {#date(2019,7,1), #date(2019,7,5), #date(2019,7,14), #date(2019,7,23), #date(2019,7,24), #date(2019,8,1), #date(2019,8,5), #date(2019,8,10), #date(2019,8,11), #date(2019,8,17), #date(2019,8,23), #date(2019,8,25), #date(2019,9,3), #date(2019,9,4), #date(2019,9,13), #date(2019,9,17), #date(2019,9,23), #date(2019,9,27), #date(2019,9,28), #date(2019,10,6), #date(2019,10,9)},
                toTable = Table.FromColumns({nameColumn, dateColumn}, type table [Name = text, Date = date])
            in toTable,
        invokeFunction = AddLastDateColumn(sourceTable)
    in
        invokeFunction
    

    I was looking at the documentation for Table.NestedJoin (https://learn.microsoft.com/en-us/powerquery-m/table-nestedjoin). It seems to have a parameter named keyEqualityComparers:

    An optional set of keyEqualityComparers may be included to specify how to compare the key columns.

    I don't have time to look into its capabilities and the syntax it requires, but perhaps it could be used to more elegantly specify the JOIN criteria: "Name must match exactly. Right Date must be the greatest date that is less than Left Date".

    In any case, I think the above function should do what you were after.