I'm trying to merge 2 queries in Power BI Desktop, matching rows based off a user and date column in one query to a row in the other query, where the user matches and the date in the 2nd query is the closest one before the date in the 1st query.
In other scenarios I need to match on more than one column, I'll usually create a composite key to match, but here's it's not a direct match.
Examples of the 2 queries are:
QUERY1
User Activity Activity Date
User 1 Activity 1 2019-01-24
User 1 Activity 2 2019-03-03
User 1 Activity 3 2019-04-17
QUERY2
User Status Status Change Date
User 1 Status 1 2019-02-05
User 1 Status 2 2019-03-06
User 1 Status 3 2019-04-05
And the merged query I'm looking for is:
MERGED QUERY
User Activity Activity Date Status
User 1 Activity 1 2019-01-24
User 1 Activity 2 2019-03-03 Status 1
User 1 Activity 3 2019-04-17 Status 3
Both queries are sourced from a REST API. If it was a SQL source, I'd use a SQL query to create a derived island table of start and stop dates based on Query2 and do a BETWEEN join against Query1 and have that be the source for Power BI.
Within the Power Query Editor, how would I get to the merged query result?
First, you want to do as you suggested and modify the status table to have start and stop dates instead of Status Change Date
. You can do this by sorting, indexing, and self-merging as I've previously explained here and here.
Once you have that, you can load a copy of the status table in each row and use the User
and Date
columns to filter the table and finally return a single value for Status
.
let
Source = <Query1 Source>
#"Added Custom" =
Table.AddColumn(Source, "Status",
(C) => List.First(
Table.SelectRows(Status,
each [User] = C[User] and
[Start] < C[Date] and
([Stop] = null or C[Date] <= [Stop])
)[Status]
),
type text)
in
#"Added Custom"
This says we take the Status
table and filter it so that based on the current row the User
matches and the Date
is between Start
and Stop
. From that filtered table, we select the Status
column, which is a list data type, so we pick the first element of the list to get the text value of the only member of the list.