
Filtering duplicate entries from Splunk events


I am new to Splunk and have some Splunk events like the ones below:

2019-06-26 23:45:36 INFO ID 123456 | Response Code 404
2019-06-26 23:55:36 INFO ID 123456 | Response Code 404
2019-06-26 23:23:36 INFO ID 258080 | Response Code 404

Is there a way to filter out the first two events, since they have the same ID 123456, and view them as one event? I tried the query below, which I know is completely wrong; any suggestions would be very useful.

index=myindex "Response Code 404" | rex field=ID max_match=2 "(?<MyID>\b(?:123456)\b)" | stats count by ID MyID | where count > 1


Solution

  • That's not completely wrong. It's one of the legitimate ways to remove duplicates. Here's another:

    index=myindex "Response Code 404"  
    | rex field=ID max_match=2 "(?<MyID>\b(?:123456)\b)" 
    | dedup MyID
    

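    Using dedup is often preferred because it keeps one complete event per MyID value, with all of its fields, whereas stats returns only the fields named in its aggregations and its by clause.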
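
    As a hedged sketch of a more general variant: the rex above matches only the literal 123456 and reads from a field named ID, so it assumes that field has already been extracted. If it has not, the ID can be captured from the raw event text instead. The pattern "ID (?<MyID>\d+)" is an assumption based on the three sample events in the question, not on the real data:

    index=myindex "Response Code 404"
    | rex "ID (?<MyID>\d+)"
    | dedup MyID

    With the sample events this should leave one event for 123456 and one for 258080. dedup keeps the first event it sees for each value, which in default search order is the most recent one, and by default it drops events where MyID is missing. If a per-ID count matters more than the event itself, a stats version might look like this (occurrences and last_seen are illustrative field names):

    index=myindex "Response Code 404"
    | rex "ID (?<MyID>\d+)"
    | stats count AS occurrences latest(_time) AS last_seen by MyID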