Search code examples
azure-data-explorerkqlkusto-java-sdk

Dynamically return columns from a kusto function


I have a set of telemetry data stored in a table in the below format. Lets call this table as RawTelemetryData

device_id TIME ABC DEF GHI LMN
123 2021-04-20 00:00:00.0000000 1 2 3 4
121 2021-04-20 00:00:00.0000000 1 2 3 4

The above table consist of telemetry data which is coming from several IoT device every 10 seconds. I have a client program which needs different columns to be selected based on device_id.

For example. For device_id 123, client program needs ABC,GHI column. Where as for device_id 121 client program expects to have only ABC,DEF.

Initially I was thinking of writing a function which accepts device_id and the columns to be selected. like the below. Where the first argumet is the device id and second argument is the columns to be selected. Say I need to select ABC,GHI of device id 123 ill call the function like this.

Approach 1

getDataByDeviceId(123,"ABC,GHI") This function should only project only ABC and GHI

 .create-or-alter function with (folder = "getData", skipvalidation = "true") getDataByDeviceId(device_id:int,columns:string) {  
  
}

But I couldn't get this working.

Approach 2

In this approach I preconfigured columns for each device_id in a separate table. In this case I tried to write a function which accepts only device id, and project columns which are configured in the below table.

DEVICE_COLUMN_MAPPING_TABLE

device_id columns
123 ABC
123 DEF
123 GHI
123 LMN
121 ABC
121 DEF
121 GHI
 .create-or-alter function with (folder = "getData", skipvalidation = "true") getDataByDeviceId(device_id:int) {  
 //program should look at  DEVICE_COLUMN_MAPPING_TABLE for colums for this particular device id and project only that.
}

But unfortunately I couldn't get this also working. :(

I want to understand is it possible to get any of the above approach working. If not i have only 1 solution which I can think of , that is construct the query with proper project statement from the source program(A java program, which calls kusto function) based on device_id and invoke to get the results. I was trying to avoid doing this and make all my logic on ADX itself. Any other approach to solve the problem also welcomed :)


Solution

  • Here is a solution based on your second approach:

    let device_mappings = datatable(device_id:long, columns_output:string)[
             123,    "ABC",
             123,   "GHI",
             121,   "ABC",
             121,   "DEF",
             122,    "LMN"];
    let dt = datatable(device_id:long,TIME:datetime,ABC:long,DEF:long,GHI:long,LMN:long) 
             [123,datetime(2021-04-20 00:00:00.0000000),1,2,3,4,
              123,datetime(2021-04-21 00:00:00.0000000),10,20,30,40,
              121,datetime(2021-04-20 00:00:00.0000000),1,2,3,4,
              122,datetime(2021-04-20 00:00:00.0000000),1,2,3,4];
    let func = view(id:long) {
    let idKeys = toscalar(device_mappings | where device_id == id | summarize make_set(columns_output));
    let otherKeys = toscalar(device_mappings | where device_id != id | summarize make_set(columns_output));
    let keysToRemove = set_difference(otherKeys,idKeys);
    dt
    | where device_id == id
    | extend rec = pack_all()
    | project filteredRec = bag_remove_keys(rec, keysToRemove)
    | evaluate bag_unpack(filteredRec)
    };
    func(123)
    

    Output: enter image description here