Search code examples
kqlazure-data-explorer

KQL: Create an id column every time a value changes in column x ordered by another column y


lets say i have two date time columns, i want to order this by timestamp and get a new column called changeid which is where the InitiatingProcessParentCreationTime changes values.

let data = datatable (Timestamp: datetime, InitiatingProcessParentCreationTime: datetime)
[
    datetime(2024-07-24 17:33:52.862), datetime(2024-07-17 16:49:53.432),
    datetime(2024-07-24 17:33:52.779), datetime(2024-07-17 16:49:53.432),
    datetime(2024-07-24 17:33:51.900), datetime(2024-07-17 16:49:53.432),
    datetime(2024-07-24 17:33:51.771), datetime(2024-07-17 16:49:53.432),
    datetime(2024-07-24 17:33:51.338), datetime(2024-07-17 16:49:53.432),
    datetime(2024-07-24 17:33:51.192), datetime(2024-07-17 16:49:53.432),
    datetime(2024-07-24 17:33:51.145), datetime(2024-07-17 16:49:53.432),
    datetime(2024-07-24 17:33:51.057), datetime(2024-07-17 16:49:53.432),
    datetime(2024-07-24 17:33:50.791), datetime(2024-07-17 15:22:47.997),
    datetime(2024-07-24 17:33:50.339), datetime(2024-07-17 16:49:53.432),
    datetime(2024-07-24 17:33:50.297), datetime(2024-07-17 16:49:53.432)
];
data

Solution

  • This solution creates a column to check if InitiatingProcessParentCreationTime changes, then uses row_cumsum() to make the column.

    This may not be the most efficient solution, but works.

    let data = datatable (Timestamp: datetime, InitiatingProcessParentCreationTime: datetime)
    [
        datetime(2024-07-24 17:33:52.862), datetime(2024-07-17 16:49:53.432),
        datetime(2024-07-24 17:33:52.779), datetime(2024-07-17 16:49:53.432),
        datetime(2024-07-24 17:33:51.900), datetime(2024-07-17 16:49:53.432),
        datetime(2024-07-24 17:33:51.771), datetime(2024-07-17 16:49:53.432),
        datetime(2024-07-24 17:33:51.338), datetime(2024-07-17 16:49:53.432),
        datetime(2024-07-24 17:33:51.192), datetime(2024-07-17 16:49:53.432),
        datetime(2024-07-24 17:33:51.145), datetime(2024-07-17 16:49:53.432),
        datetime(2024-07-24 17:33:51.057), datetime(2024-07-17 16:49:53.432),
        datetime(2024-07-24 17:33:50.791), datetime(2024-07-17 15:22:47.997),
        datetime(2024-07-24 17:33:50.339), datetime(2024-07-17 16:49:53.432),
        datetime(2024-07-24 17:33:50.297), datetime(2024-07-17 16:49:53.432)
    ];
    data
    | sort by Timestamp asc
    | serialize
    | extend ChangeGroup = iif(InitiatingProcessParentCreationTime != prev(InitiatingProcessParentCreationTime, 1), 1, 0)
    | extend ChangeID = row_cumsum(ChangeGroup)