Search code examples
splunksplunk-querysplunk-formulasplunk-calculation

Splunk: Find events that don't have a certain attribute logged as different log lines


We have Splunk logs like:

ts=20:10:01 id=1 state=first foo=bar
ts=20:10:05 id=1 state=second foo=bar
ts=20:10:06 id=1 state=third foo=bar

ts=20:10:03 id=2 state=first foo=bar

ts=20:11:01 id=3 state=first foo=bar
ts=20:11:03 id=3 state=second foo=bar
ts=20:11:05 id=3 state=third foo=bar

I would like to find all id that does not have the other 2 states. In this example all id=2 that logged first state but not the other 2. I was reading on JOIN and found that it can only look at events that occur with both events but we can't exclude those events.

index=my-idx foo=bar 
| join id type=outer
  [search index=my-idx foo=bar NOT (state=second OR state=third) | table id]
| table id

The Query I am thinking of should return a list of ids that don't have the state=second or state=third which in the above example should return id=2


Solution

  • Here's a run-anywhere example query that should do it. Comments in the query explain what it does. It assumes the first state of any id is always "first".

    | makeresults 
    | eval data="ts=20:10:01 id=1 state=first foo=bar;
    ts=20:10:05 id=1 state=second foo=bar;
    ts=20:10:06 id=1 state=third foo=bar;
    ts=20:10:03 id=2 state=first foo=bar;
    ts=20:11:01 id=3 state=first foo=bar;
    ts=20:11:03 id=3 state=second foo=bar;
    ts=20:11:05 id=3 state=third foo=bar"
    | eval data=split(data,";")
    | mvexpand data
    | eval _raw=data
    | extract kvdelim=" ", pairdelim="="
    | fields ts,id,state,foo
    ```Above just sets up test data```
    ```Count how many different states each id has```
    | streamstats dc(state) as count by id
    ```Find the highest count for each id```
    | eventstats max(count) as max by id
    ```Select only those with a single state```
    | where max=1
    | table ts id state foo