Search code examples
splunksplunk-query

How to extract a value from fields when using stats()


Query:

index = test
| stats values(*) as * by ip_addr, location
| where location="USA"
| fields timestamp, user, ip, location, message

Result:

+--------------------------------------------------------------------+
| timestamp         | user   | ip          | location | message      |
+--------------------------------------------------------------------+
| 08/08/2020 17:00  | thomas | 10.10.10.10 | USA      | Hello, world!|
| 08/08/2020 17:05. | unknown|             |          | I love steak!|
| 08/08/2020 17:10. |        |             |          | I love soda! |
+--------------------------------------------------------------------+
| 08/08/2020 17:00  | jeffry | 10.10.10.20 | USA      | Hello, world!|
| 08/08/2020 17:35  | unknown|             |          | I love pancke|
| 08/08/2020 17:40  |        |             |          | I love waffle|
+--------------------------------------------------------------------+

I want to:

  1. make those multiple timestamps become one single timestamp
  2. remove the "unknown" value in the "user" field
  3. make "message" field to display only the "Hello, world!" - I dont care about the rest.

I tried to do:

index = test
| stats values(*) as * by ip_addr
| where location="USA"
| eval user=replace(user, "unknown", "")
| fields timestamp, user, ip, location, message

But it removes all the values under "user" field. Any advice? My number 2 and number 3 goals look similar. If I could crack either one of them, I think I could solve the other one easily.


Solution

  • Yes, timestamps can be averaged, if they are in epoch (integer) form. The result of the values(*) function is a multi-value field, which doesn't work well with replace or most other commands and functions not designed for them. That's why I use the mvfilter and mvdedup commands below.

    index = test
    | where location="USA"
    | stats earliest(timestamp) as timestamp, values(*) as * by ip_addr
    | eval user=mvdedup(mvfilter(!match(user, "unknown")),
     message=mvdedup(mvfilter(match(message, "Hello, world!"))
    | fields timestamp, user, ip, location, message