Search code examples
azure-data-explorerkql

Kusto: How summarize calculated data


I have start and end calculated columns which I have read from Table1. And comparing the how many events are happened in this between time .

Input Data:
    let Mytable1=datatable (Vin:string,start_time:datetime ,End_time:datetime )
    [ABC,datetime(2021-03-18 08:49:08.467), datetime(2021-03-18 13:32:28.000),
    ABC,datetime(2021-03-18 13:41:59.323),datetime(2021-03-18 13:41:59.323),
    ABC,datetime(2021-03-18 13:46:59.239),datetime(2021-03-18 14:58:02.000)];

    let Mytable2=datatable(Vin:string,Timestamp:datetime)
    [ABC,datetime(2021-03-18 08:49:08.467),ABC,datetime(2021-03-18 08:59:08.466),ABC,datetime(2021-03-18 09:04:08.460),ABC,datetime(2021-03-18 13:24:27.0000000)];

Query:

let Test=Table1 
|where Vin =="ABC" | distinct Vin,Start_Time,End_Time;
let min1=toscalar(Test |summarize min1= min(Start_Time));
let max1=toscalar(Test |summarize max1=max(End_Time));
Table2
|where Vin =="ABC" and  Timestamp between (todatetime(min1) ..todatetime(max1))
| join  kind=fullouter   Test
on  $left.Vin == $right.Vin and $left.Timestamp== $right.Start_Time 
|summarize Events= (count())  by Timestamp,Vin,Start_Time,End_Time
|project Timestamp,Start_Time,End_Time,Events

Output of above query is : enter image description here But My expected output is : enter image description here Means Events count from between two start and end time.


Solution

  • You should not have timestamp in your final aggregation. A working example could look like:

    let measurement_range=datatable (vin:string,start_time:datetime ,end_time:datetime )
    ["ABC",datetime(2021-03-18 08:49:08.467),datetime(2021-03-18 13:32:28.000),
     "ABC",datetime(2021-03-18 13:41:59.323),datetime(2021-03-18 13:44:59.323),
     "ABC",datetime(2021-03-18 13:46:59.239),datetime(2021-03-18 14:58:02.000),
     ];
    let measurement=datatable(vin:string,timestamp:datetime)
        ["ABC",datetime(2021-03-18 08:49:08.467),
         "ABC",datetime(2021-03-18 08:59:08.466),
         "ABC",datetime(2021-03-18 09:04:08.460),
         "ABC",datetime(2021-03-18 13:42:27.0000000)];
    measurement_range
    | join kind=inner (measurement) 
            on vin     
    | where timestamp between (start_time..end_time)
    | summarize event=(count()) by vin, start_time, end_time
    

    With this you get a count for your measurement window. In this example you get a large intermediate resultset, as the timerange is considered in the where statement. Please see the Azure Data Explorer Documentation how to optimize time window joins (the example is not efficient for larger datasets).