Search code examples
azure-data-explorerkql

How to format discrete events into a state-timeline compatible table?


I'm trying to use Grafana's state timeline graf and for this to work I need to transform my source data:

TimeGenerated playbook_name status runtime
11:02 config_backup success 3600
12:33 update_inventory success 52
13:04 config_backup partial_success 3600
14:31 update_inventory success 47

Into this: (see below)

I want the job column to illustrate the status for as long as "job start time " + "runtime" is within the TimeGenerated column. Does that make sense?

Like, if the config_backup job starts at 11:00 and lasts for 3600 seconds, every row in the config_backup column between 11:00 and 12:00 should be marked, ultimately with the status of the "status" column, but I'm willing to compromise.

TimeGenerated config_backup update_inventory
11:00 success
11:15 success
11:30 success
11:45 success
12:00
12:15
12:30 success
12:45
13:00 partial_success
13:15 partial_success
13:30 partial_success
so on...

There are a few steps involved here and I got fairly far but now I'm getting stuck so I need help.

First, we need to re-write the TimeGenerated into 15 minute blocks. This is how I've achieved it:

let _start=ago(1d);
let _end=now();
let interval = 15m;
let timetable = range TimeGenerated from ago(1d) to now() step 15m;
log_ansible_playbook_stats_CL
| union timetable

I also know how to use make_series to pivot the rows and columns, so for now I was focusing more on marking the status for as long as there is runtime left.

This is what I came up with

| extend runtime_s = toint(prev(runtime_s)) - 900

Thinking that, I can later do an extend case "if runtime is not empty, status = something"

The problem is, this only works for one row:

TimeGenerated playbook_name status runtime
11:00 config_backup success 3600
11:15 2700
11:30
11:45

And that's when I realized, the whole table is evaluated at once, not row-by-row as I was hoping. Now, I really don't know how to proceed. I've looked at the fillforward() function but it doesn't have a conditional so I can't stop it once we're outside the runtime of the job.

However, now that I say this I'm thinking I could emit a log at the START of the job, as well as at the END, then use fillforward() to fill the gap between the two timestamps.

But perhaps, I'm hoping that I simply missed the obvious solution and one of you can set me on the right path.

Thank you,


Solution

  • You can use the mv-expand operator to generate the additional time entries:

    datatable(TimeGenerated:datetime,playbook_name:string,status:string,runtime:int 
    )[
    datetime(11:02), "config_backup", "success", 3600,
    datetime(12:33), "update_inventory", "success", 52,
    datetime(13:04), "config_backup", "partial_success", 3600,
    datetime(14:31), "update_inventory", "success", 47,
    ]
    | mv-expand TimeGenerated=
        range(bin(TimeGenerated, 15m), TimeGenerated + runtime * 1s, 15m)
    | project TimeGenerated, playbook_name, status
    

    With mv-apply you can also generate time events with an empty status:

    datatable (
        TimeGenerated: datetime,
        playbook_name: string,
        status: string,
        runtime: int
    )[
        datetime(11:02), "config_backup", "success", 3600,
        datetime(12:33), "update_inventory", "success", 52,
        datetime(13:04), "config_backup", "partial_success", 3600,
        datetime(14:31), "update_inventory", "success", 47,
    ]
    | mv-apply TimeGenerated2= range(datetime(11:00), datetime(15:00), 15m) to typeof(datetime) on (
      extend status= iff(TimeGenerated2 between (bin(TimeGenerated, 15m) .. (TimeGenerated + runtime * 1s)), status, "")
    )
    | project TimeGenerated= TimeGenerated2, playbook_name, status
    | evaluate pivot(playbook_name, take_any(status))