amazon-web-services, aws-cloudwatch-log-insights

Transform parsed field depending on another parsed field's value


I have four months of logs for the time it takes some internal process to finish.
The problem is that this time is given in different units (millis, seconds, minutes) depending on how long it took (not my decision, this is legacy code).
Example:

Process finished after 732 ms
Process finished after 3.53 s
Process finished after 10.84 min

I want to use Log Insights to obtain a graphical representation of this time distribution.

  • I know how to filter @message like /Process finished after/ to get only the messages I need.
  • I know how to parse @message " after * *" as t_elapsed, t_unit to extract the time and unit into variables.
  • I don't know how to convert all values to the same unit (as in, divide t_elapsed by 1000 if t_unit is "ms", multiply it by 60 if t_unit is "min", or leave it as-is if t_unit is "s")
  • I'm not sure about how to graph the values in such a way that it looks like a distribution. Maybe something along the lines of stats count(*) as c by ceil(t_elapsed) with a bar-type visualization? But this obviously needs all t_elapsed values to be in the same unit...

So, if I manage to convert all t_elapsed values to seconds, I think I'll be able to get a proper graph.

Any tips?

PS: advice about the graph itself is also welcome.


Solution

  • AWS CloudWatch Logs Insights does not support conditional statements the way SQL does, for example, but this is how I worked around it:

    As the time units (in your example) are limited to s, ms and min, it is possible to parse each duration explicitly into its own field (t_elapsed_ms, t_elapsed_s and t_elapsed_min).
    Using the fields statement again, I created new fields for the converted values. Luckily, arithmetic operations return null if one of the operands is null, which works in our favour here.
    Finally, using display with coalesce, the only non-null t_elapsed* field is returned as elapsed_s.

    fields @timestamp, @message, @logStream, @log
    | parse @message " after * ms" as t_elapsed_ms
    | parse @message " after * s" as t_elapsed_s
    | parse @message " after * min" as t_elapsed_min
    | fields t_elapsed_min * 60 as t_elapsed_min_as_s, t_elapsed_ms / 1000 as t_elapsed_ms_as_s
    | display coalesce(t_elapsed_min_as_s, t_elapsed_ms_as_s, t_elapsed_s) as elapsed_s
    | sort @timestamp desc
    | limit 1000
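
To check the conversion logic outside CloudWatch (for instance on a few exported log lines), the same three steps can be mimicked locally. This is only a rough sketch in Python, not anything provided by AWS: the helper names (parse_field, mul, div, coalesce, elapsed_seconds) are made up for illustration, and the regular expressions are an assumption about what the log lines look like, based on the three examples in the question.

```python
import re
from typing import Optional

# Hypothetical patterns, one per unit, mirroring the three parse lines of the query.
PATTERNS = {
    "ms": re.compile(r" after (\d+(?:\.\d+)?) ms\b"),
    "s": re.compile(r" after (\d+(?:\.\d+)?) s\b"),
    "min": re.compile(r" after (\d+(?:\.\d+)?) min\b"),
}


def parse_field(message: str, unit: str) -> Optional[float]:
    """Return the number captured for this unit, or None if the pattern does not match."""
    match = PATTERNS[unit].search(message)
    return float(match.group(1)) if match else None


def mul(value: Optional[float], factor: float) -> Optional[float]:
    """Multiplication that yields None when the operand is None."""
    return None if value is None else value * factor


def div(value: Optional[float], divisor: float) -> Optional[float]:
    """Division that yields None when the operand is None."""
    return None if value is None else value / divisor


def coalesce(*values: Optional[float]) -> Optional[float]:
    """Return the first value that is not None, or None if there is none."""
    return next((v for v in values if v is not None), None)


def elapsed_seconds(message: str) -> Optional[float]:
    """Normalise the duration found in a log line to seconds."""
    return coalesce(
        mul(parse_field(message, "min"), 60),
        div(parse_field(message, "ms"), 1000),
        parse_field(message, "s"),
    )


if __name__ == "__main__":
    for line in (
        "Process finished after 732 ms",
        "Process finished after 3.53 s",
        "Process finished after 10.84 min",
    ):
        print(line, "->", elapsed_seconds(line))
```

Only one of the three patterns matches any given line, so at most one argument to coalesce is not None, which is the same property the query relies on.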
    

    See the AWS documentation on boolean, comparison, numeric, datetime, and other functions for the CloudWatch Logs Insights service.
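
As for the graph: once every value is in seconds, the bucketing idea from the question (count by ceil of the elapsed time) can be tried out locally before committing to a bucket size. The following is a minimal Python sketch under that assumption; bucket_counts and its width parameter are hypothetical names, and the sample values are simply the three durations from the question plus one extra.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def bucket_counts(elapsed_s: Iterable[float], width: float = 1.0) -> Dict[float, int]:
    """Count values per bucket, keyed by the bucket's upper edge (ceil-style rounding)."""
    counts = Counter(math.ceil(value / width) * width for value in elapsed_s)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    sample = [0.732, 3.53, 3.9, 650.4]  # durations already converted to seconds
    print(bucket_counts(sample))             # one-second buckets
    print(bucket_counts(sample, width=60))   # one-minute buckets
```

Because the durations span milliseconds to minutes, one-second buckets may leave a long, sparse tail; a wider bucket is probably worth trying for the bar chart.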
