Search code examples
azure-data-explorerkqlkusto-explorer

Join when a date is within a date range ( Kusto / KQL / Azure Data Explorer )


In KQL, how can you add criteria for a join? For example, the query below shows a join. I only want to join rows when the 'code' column is equal and when 'date' is between StartDate and EndDate. I know this is possible in SQL but have not seen a working example in KQL.

Please keep in mind that the example below is not the actual dataset. I know I can achieve the desired result by filtering post join; however, this is not possible with the actual dataset due to the size of table B. When I try to filter post join, I run out of memory (runaway query).

let A = datatable(Code:string, StartDate:string, EndDate:string, Rate:string)[
'EUR', '2020-01-01', '2020-01-15', '1.5',
'EUR', '2020-01-15', '2020-01-31', '1.7',
'YEN', '2020-01-01', '2020-01-15', '2.1',
'YEN', '2020-01-15', '2020-01-31', '1.9'];
let B = datatable(Key:string, Code:string, Date:string)[
'1', 'EUR', '2020-01-02', 
'2', 'EUR', '2020-01-16', 
'3', 'EUR', '2020-01-13', 
'4', 'EUR', '2020-01-27', 
'5', 'YEN', '2020-01-03', 
'6', 'YEN', '2020-01-03', ];
A | join kind=inner B on Code 
| project-away *1, StartDate, EndDate
| summarize by Key, Date, Code, Rate

Current Result

Key Date        Code    Rate
1   2020-01-02  EUR     1.7
1   2020-01-02  EUR     1.5
2   2020-01-16  EUR     1.7
2   2020-01-16  EUR     1.5
3   2020-01-13  EUR     1.7
3   2020-01-13  EUR     1.5
4   2020-01-27  EUR     1.7
4   2020-01-27  EUR     1.5
5   2020-01-03  YEN     1.9
5   2020-01-03  YEN     2.1
6   2020-01-03  YEN     1.9
6   2020-01-03  YEN     2.1

Expected Result

Key Date        Code    Rate
1   2020-01-02  EUR     1.5
2   2020-01-16  EUR     1.7
3   2020-01-13  EUR     1.5
4   2020-01-27  EUR     1.7
5   2020-01-03  YEN     2.1
6   2020-01-03  YEN     2.1

Solution

  • You can add a filter after the join and use the Time window join technique

    let lookupWindow = 1d;
    let A = datatable(Code:string, StartDate:datetime , EndDate:datetime , Rate:string)[
    'EUR', '2020-01-01', '2020-01-15', '1.5',
    'EUR', '2020-01-15', '2020-01-31', '1.7',
    'YEN', '2020-01-01', '2020-01-15', '2.1',
    'YEN', '2020-01-15', '2020-01-31', '1.9'];
    let B = datatable(Key:string, Code:string, Date:datetime )[
    '1', 'EUR', '2020-01-02', 
    '2', 'EUR', '2020-01-16', 
    '3', 'EUR', '2020-01-13', 
    '4', 'EUR', '2020-01-27', 
    '5', 'YEN', '2020-01-03', 
    '6', 'YEN', '2020-01-03', ];
    A 
    | extend TimeKey = range(bin(StartDate, lookupWindow), bin(EndDate,lookupWindow), lookupWindow)
    | mv-expand TimeKey to typeof(datetime)
    | join kind=inner (B 
        | extend TimeKey = bin(Date, 1d)
    ) on Code,TimeKey
    | where Date between (StartDate .. EndDate)
    | project-away  StartDate, EndDate, TimeKey*, Code1
    
    Code Rate Key Date
    EUR 1.5 1 2020-01-02 00:00:00.0000000
    EUR 1.5 3 2020-01-13 00:00:00.0000000
    EUR 1.7 2 2020-01-16 00:00:00.0000000
    EUR 1.7 4 2020-01-27 00:00:00.0000000
    YEN 2.1 5 2020-01-03 00:00:00.0000000
    YEN 2.1 6 2020-01-03 00:00:00.0000000