Search code examples
azure-data-explorerkql

The unix uniq -c command equivalent (sort of) in KQL?


The uniq command filters out duplicate lines. What is important is that it does not change the order of the input lines, so:

$ echo "1,1,1,2,1,1" | tr ',' '\n' | uniq -c
      3 1
      1 2
      2 1

As we can see for every batch of identical values it specifies the count of values and then the value itself.

I am looking for an equivalent in KQL. Of course, I would like to run some aggregate functions on some fields, but the essence is the same. So far I was using the summarize operator, but it is not the same, because it groups all the instances of the identical values, even if in my ordering there are other records between them. It is like the difference between sort | uniq -c and uniq -c.

How can I summarize without reordering?


Solution

  • datatable(timestamp:int,val:int)[7,1 ,9,1 ,12,1 ,13,2 ,19,1 ,24,1 ,25,1, 26,1 ,27,2, 29,2 ,33,2 ,36,1, 42,1]
    | order by timestamp asc
    | extend group_id = row_number() - row_number(1, prev(val) != val)
    | summarize val_occurrences = count(), from_timestamp = min(timestamp), to_timestamp = max(timestamp) by group_id, val
    | project-away group_id
    | order by from_timestamp asc
    

    Raw data:

    +-----------+-----+
    | timestamp | val |
    +-----------+-----+
    |         7 |   1 |
    |         9 |   1 |
    |        12 |   1 |
    |        13 |   2 |
    |        19 |   1 |
    |        24 |   1 |
    |        25 |   1 |
    |        26 |   1 |
    |        27 |   2 |
    |        29 |   2 |
    |        33 |   2 |
    |        36 |   1 |
    |        42 |   1 |
    +-----------+-----+
    

    Results:

    +-----+-----------------+----------------+--------------+
    | val | val_occurrences | from_timestamp | to_timestamp |
    +-----+-----------------+----------------+--------------+
    |   1 |               3 |              7 |           12 |
    |   2 |               1 |             13 |           13 |
    |   1 |               4 |             19 |           26 |
    |   2 |               3 |             27 |           33 |
    |   1 |               2 |             36 |           42 |
    +-----+-----------------+----------------+--------------+