
Kusto: union of intermediate result produced by expensive calculation


I have an expensive query that needs a lot of CPU and memory to produce its result. However, the result set contains only a small number of rows.

let result = expensive_function()
    | summarize A=xxx, B=xxx by X, Y, Z;

I want to append rows that are further summarized from the result. For example, drop the Z column from the summarize keys and set Z="ALL" on the rolled-up rows.

result
| union (
    result
    | summarize A=XXX, B=XXX by X, Y
    | extend Z="ALL"
)

When this runs, Kusto appears to expand and execute expensive_function() once for each branch of the union operator, in parallel, which doubles CPU and memory consumption.

I tried adding hint.concurrency=1 to the union operator. This reduces peak memory to that of a single result query, but the execution time doubles.
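
For reference, a minimal sketch of where that hint sits on the union operator. The datatable and its values are a hypothetical stand-in for the expensive source, used only so the snippet runs on its own:

```kusto
// Hypothetical stand-in for the summarized output of the expensive source.
let result = datatable(X:string, Y:string, Z:string, A:long, B:long)
[
    "x1", "y1", "z1", 1, 10,
    "x1", "y1", "z2", 2, 20
];
result
// hint.concurrency=1 asks the engine to run the union legs one at a time.
| union hint.concurrency=1 (
    result
    | summarize A = sum(A), B = sum(B) by X, Y
    | extend Z = "ALL"
)
```

The hint only changes how many legs of the union run at the same time. Each leg still evaluates its own copy of result, which matches the doubled run time described above.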

Is there a way to tell Kusto to freeze the intermediate result, so that all subsequent query steps operate on the frozen result instead of recomputing everything from the source?


Solution

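  • Use the materialize() function, which caches the result of the wrapped subquery for the duration of the query so that later references reuse it instead of recomputing it: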

    let result = materialize(expensive_function()
        | summarize A=xxx, B=xxx by X, Y, Z);
    result
    | union (
        result
        | summarize A=XXX, B=XXX by X, Y
        | extend Z="ALL"
    )
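
To try the pattern without the real workload, here is a self-contained sketch. The body of expensive_function(), the V column, and the sum/count aggregates are all assumptions made for illustration; they stand in for the placeholders in the original query:

```kusto
// Hypothetical stand-in for expensive_function(): a small inline table.
let expensive_function = () {
    datatable(X:string, Y:string, Z:string, V:long)
    [
        "x1", "y1", "z1", 1,
        "x1", "y1", "z2", 2,
        "x2", "y1", "z1", 3
    ]
};
// materialize() caches the summarized rows, so both references to
// result below read the cached rows instead of re-running the function.
let result = materialize(
    expensive_function()
    | summarize A = sum(V), B = count() by X, Y, Z
);
result
| union (
    result
    // Roll up over Z by re-aggregating the already summarized columns.
    | summarize A = sum(A), B = sum(B) by X, Y
    | extend Z = "ALL"
)
```

The roll-up re-aggregates A and B from the cached rows, which only works when the aggregates compose this way (sums and counts do; something like a distinct count would not). Note also that the materialize() cache has a size limit, so the pattern fits best when, as here, the intermediate result is small.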