splunk splunk-query splunk-dashboard splunk-formula splunk-sdk

Splunk query to keep only one event per unique combination of two key values in the results of the main search


I have a Splunk query like this:

index=my_app environment=test source="/users/sahild/app.log" "fname" OR "lname" OR "dob" OR "address" | <>

Now, from the initial results of the main query (before the pipe), I need to remove duplicates so that only one event remains for each unique combination of "cls" and "mthd". For example, if my initial results are:

2024-08-28 23:07:30,285 INFO  {"msg":"Response: {\"Output\":[{\"Address\":\"Bay Rd\",\"PostalCode\":\"12345\","cls":"myfirstclass","mthd":"firstMethod"}

2024-08-28 23:07:30,285 INFO  {"msg":"Response: {\"Output\":[{\"Address\":\"Lincoln Rd\",\"PostalCode\":\"45678\","cls":"myfirstclass","mthd":"firstMethod"}

2024-08-28 23:07:30,285 INFO  {"msg":"Response: {\"Output\":[{\"fName\":\"John\",\"PostalCode\":\"12345\","cls":"mySecondClass","mthd":"secondMethod"}

2024-08-28 23:07:30,285 INFO  {"msg":"Response: {\"Output\":[{\"fName\":\"Emma\",\"PostalCode\":\"45678\","cls":"mySecondClass","mthd":"secondMethod"}

I want to filter the events so that each unique combination of "cls" and "mthd" values appears only once. The final result should look something like:

2024-08-28 23:07:30,285 INFO  {"msg":"Response: {\"Output\":[{\"Address\":\"Lincoln Rd\",\"PostalCode\":\"45678\","cls":"myfirstclass","mthd":"firstMethod"}

2024-08-28 23:07:30,285 INFO  {"msg":"Response: {\"Output\":[{\"fName\":\"John\",\"PostalCode\":\"12345\","cls":"mySecondClass","mthd":"secondMethod"}

The initial search returns hundreds of thousands of results, and I don't want repeated data for the same cls and mthd. I hope my ask is clear.

I have not been able to try much after the pipe, since I don't know the Splunk regex syntax or the commands that could achieve this. I need some expert help.


Solution

  • I found a solution. The rex command can extract the "cls" and "mthd" pair into a Splunk field (ClassAndMethodName here), which other Splunk commands can then operate on. Something like this:

    index=myIndex environment=test source="/path/to/logsfile/demo.log" ("firstName" OR "lastName") AND "mthd" AND "cls" 
    | rex field=_raw max_match=0 "(?<ClassAndMethodName>\"cls\"\:\"\w*\",\"mthd\"\:\"\w*\")" 
    | dedup ClassAndMethodName
    

    This query returns only one event for each unique combination of cls and mthd values.
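
    A possible variant, offered as a sketch rather than a tested query: extract the two values into separate fields and pass both to dedup, which accepts a list of fields and keeps one event per distinct combination. The field names cls and mthd are chosen here for illustration, and the index, source, and search terms are simply copied from the query above.

    ```spl
    index=myIndex environment=test source="/path/to/logsfile/demo.log" ("firstName" OR "lastName") AND "mthd" AND "cls"
    | rex field=_raw "\"cls\":\"(?<cls>\w*)\""
    | rex field=_raw "\"mthd\":\"(?<mthd>\w*)\""
    | dedup cls mthd
    ```

    Separate fields do not depend on "cls" and "mthd" being adjacent in the raw event, and they can be reused later in the pipeline (for example in table or stats). dedup keeps the first event it encounters for each combination, which in the default search order is the most recent one, and by default it drops events where either field is missing.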