Search code examples
splunksplunk-query

Splunk query to return list when a process' first step is logged but its last step is not


Splunk count and grouping question...

I have records for various points in a repeated process. At each point four fields are logged: _time, uniqueId, sessionId, and message.

The uniqueId and sessionId are different from each other, but are consistent each time through the process.

The message has several possible values. But I only care about when it is "firstMessage" or "lastMessage".

The logs may look something like this:

_time uniqueId message sessionId
a 12-AB firstMessage aaa-3
b 12-AB otherMessage aaa-3
c 12-AB lastMessage aaa-3
d 28-EA firstMessage bbb-5
e 28-EA otherMessage bbb-5
f 33-A7 firstMessage jjj-9
g 33-A7 otherMessage jjj-9
h 71-3C firstMessage uuu-3
i 71-3C otherMessage uuu-3
j 71-3C lastMessage uuu-3

I want a table showing all fields when the firstMessage for a uniqueId is recorded, but the lastMessage for that uniqueId is not. That is, the process starts but does not end successfully.

The output I would like, given the above, would be:

_time uniqueId message sessionId
d 28-EA firstMessage bbb-5
f 33-A7 firstMessage jjj-9

I tried:

...start of search... message="firstMessage" OR msg="lastMessage"
| stats count values(message) by uniqueId
| where count < 2
| fields _time uniqueId message sessionId

But the time and sessionId columns are blank.

Any and all help is appreciated.

Yes, _time is supposed to be your basic Splunk Date/Time. I just wasn't typing those out.


Solution

  • You just need to make your stats command create all the fields you want to have. The time you want is the min(time) for that uniqueId. You also need an aggregation for the sessionId - if the sessionId is the same for all the events that have a single uniqueId then you can use min(sessionId) for that as well:

    ...start of search... message=firstMessage OR message=lastMessage
    | stats count values(message) as message, min(_time) as _time, min(sessionId) as sessionId by uniqueId
    | where count < 2
    | table _time uniqueId message sessionId
    

    Here is a runable example - first it makes the raw data then it does the query:

    | makeresults
    | eval _raw="
    uniqueId message      sessionId
    12-AB    firstMessage aaa-3
    12-AB    otherMessage aaa-3
    12-AB    lastMessage  aaa-3
    28-EA    firstMessage bbb-5
    28-EA    otherMessage bbb-5
    33-A7    firstMessage jjj-9
    33-A7    otherMessage jjj-9
    71-3C    firstMessage uuu-3
    71-3C    otherMessage uuu-3
    71-3C    lastMessage  uuu-3
    "
    | multikv forceheader=1 fields uniqueId message sessionId
    | streamstats count
    | eval _time=_time-count
    
    | search message=firstMessage OR message=lastMessage
    | stats count values(message) as message, min(_time) as _time, min(sessionId) as sessionId by uniqueId
    | where count < 2
    | table _time uniqueId message sessionId
    

    It produces all the fields:

    _time uniqueId message sessionId
    2021-05-26 09:57:07 28-EA firstMessage bbb-5
    2021-05-26 09:57:05 33-A7 firstMessage jjj-9