splunk, splunk-query

How to label a cluster based on the first Message in Splunk


I am trying to achieve the following:

Generate a timechart showing the number of different errors occurring on a server.

I can do this with the query below:

  index = "my_host" LogLevel=ERROR
  | eval Message=mvindex(field1,1) 
  | timechart count(LogLevel) BY Message

This generates a graph like the one below:

[Image: Query Without Cluster]

This works as expected. The issue arises when I try to cluster the messages:

  index = "my_host" LogLevel=ERROR
  | eval Message=mvindex(field1,1)
  | cluster t=0.2 field=Message showcount=true labelonly=true
  | timechart count(LogLevel) BY cluster_label

[Image: Query with cluster]

The graph is exactly what I expected. My challenge now is the series labels: numeric labels such as [1, 2, 3, 4, ...] aren't user friendly.

Is it possible to change this label to the Message field but still group by cluster_label?


Solution

  • You'll need to correlate the Message field with the cluster_label field and then use the new field in the timechart command. I was able to do it like this:

    index = "my_host" LogLevel=ERROR
    | eval Message=mvindex(field1,1)
    | cluster t=0.2 field=Message showcount=true labelonly=true 
    | bin span=30m _time
    | stats count, first(Message) as CL by cluster_label, _time
    | timechart max(count) BY CL
    

    The stats command gets the first Message for each cluster_label, and bin groups events into fixed _time buckets so that timechart works properly on the aggregated rows. Choose the span in the bin command to match your time window. One drawback of this method is that timechart can no longer select a span automatically, because bin has already fixed it.
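
    As an untested sketch of an alternative, assuming the same index and field names as above: eventstats can attach one label to every event in a cluster before charting, so the bin step is not needed and timechart can pick its own span. CL is the same illustrative field name used above, and first() returns whichever event of each cluster the search encounters first, which is not necessarily the earliest one in time.

        index = "my_host" LogLevel=ERROR
        | eval Message=mvindex(field1,1)
        | cluster t=0.2 field=Message showcount=true labelonly=true
        | eventstats first(Message) as CL by cluster_label
        | timechart count BY CL

    Because the label is computed once per cluster_label rather than once per time bucket, each cluster should keep a single series name across the whole chart.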