Search code examples
partitioningkqlazure-data-explorer

partition by multiple columns in Kusto


I recently learned about partition function in Kusto but struggle to find a way to partition by multiple columns. Here is the case I'm failing to figure out:

I'm trying to fetch top 3 account_executive_id based on their max_sales by billable_id, organization_id, and product. But the KQL script below is returning results per each product across all billable_id, organization_id, and products.

table_1 
| where day>= ago(28d)
| summarize hint.strategy=shuffle max_sales=max(sales_amount) by billable_id, organization_id, product, account_executive_id
| partition hint.strategy=shuffle by product (top 3 by max_sales desc)

Would you please advise me how to achieve the outcome per each combination of billable_id, organization_id, and product?


Solution

  • The partition operator in Kusto only supports partitioning by a single column. To achieve your requirement of partitioning by multiple columns, you can concatenate all the columns into a single column and then use that column as a partitioning key in the query. Below is the sample query.

    table_1
    | summarize max_sales=max(sales_amount) by billable_id, organization_id, product, account_executive_id
    | extend partition_columns= strcat(billable_id," | ",organization_id," | ",product )
    | partition hint.strategy=shuffle by partition_columns (top 3 by max_sales desc)
    

    Here, the extend operator concatenates the billable_id, organization_id, and product columns into a single column called partition_columns, separated by the | character. And the partition operator partitions the data by the partition_columns column, and the top operator returns the top 3 account_executive_id for each partition, based on the max_sales value.

    demo