Search code examples
splunkdata-extractionsplunk-query

Splunk field extractions from different events & delimiters


My Splunk log format of key event timestamps is as below :

[Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=Hour = 18-nov-2020 11:00:00]
[Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=Id = 126566]
[Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=zipBefore = 18-nov-2020 12:27:08.776174]
[Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=zipAfter = 18-nov-2020 12:36:52.718122]
[Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=StartTime = 18-nov-2020 12:17:10.603227]
[Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=EndTime = 18-nov-2020 12:36:53.094513]
[Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=beginThread = 18-nov-2020 12:17:10.905782]
[Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=endThread = 18-nov-2020 12:24:22.628907]
[Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=LogTime = CASE1~6~18-nov-2020 12:17:11.377070~0~18-nov-2020 12:17:12.608526,CASE1~0~18-nov-2020 12:17:11.365409~0~18-nov-2020 12:17:12.654285,CASE3~0~18-nov-2020 12:17:12.644921~11~18-nov-2020 12:17:13.636655,CASE2~5~18-nov-2020 12:17:13.295225~700000~18-nov-2020 12:23:29.370142,CASE2~2~18-nov-2020 12:17:12.815714~700000~18-nov-2020 12:23:31.400500]

I would like to extract all the key event timestamps into table fields like below so i could do a difference between them etc :

Hour                  Id      StartTime                    EndTime                      beginThread                 endThread                   zipBefore                    zipAfter
18-nov-2020 11:00:00  126566  18-nov-2020 12:17:10.603227  18-nov-2020 12:36:53.094513  18-nov-2020 12:17:10.905782 18-nov-2020 12:24:22.628907 18-nov-2020 12:27:08.776174  18-nov-2020 12:36:52.718122

Also, my last event in log has difference cases, threads & timestamp which I need to extract separately based on delimiter something like below :

Case Thread StartTime                    Count EndTime
CASE1     6 18-nov-2020 12:17:11.377070      0 18-nov-2020 12:17:12.608526
CASE1     0 18-nov-2020 12:17:11.365409      0 18-nov-2020 12:17:12.654285
CASE2     5 18-nov-2020 12:17:13.295225 700000 18-nov-2020 12:23:29.370142
CASE2     2 18-nov-2020 12:17:12.815714 700000 18-nov-2020 12:23:31.400500
CASE3     0 18-nov-2020 12:17:12.644921     11 18-nov-2020 12:17:13.636655

Solution

  • Below is an example query that accomplishes the first task. One problem, however, is it only works with a single set of events. Since there is no obvious connection among the 8 events, there is no separate the 8 of one transaction from the 8 of another transaction.

    | makeresults | eval data="[Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=Hour = 18-nov-2020 11:00:00]!
    [Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=Id = 126566]!
    [Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=zipBefore = 18-nov-2020 12:27:08.776174]!
    [Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=zipAfter = 18-nov-2020 12:36:52.718122]!
    [Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=StartTime = 18-nov-2020 12:17:10.603227]!
    [Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=EndTime = 18-nov-2020 12:36:53.094513]!
    [Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=beginThread = 18-nov-2020 12:17:10.905782]!
    [Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=endThread = 18-nov-2020 12:24:22.628907]!
    [Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=LogTime = CASE1~6~18-nov-2020 12:17:11.377070~0~18-nov-2020 12:17:12.608526,CASE1~0~18-nov-2020 12:17:11.365409~0~18-nov-2020 12:17:12.654285,CASE3~0~18-nov-2020 12:17:12.644921~11~18-nov-2020 12:17:13.636655,CASE2~5~18-nov-2020 12:17:13.295225~700000~18-nov-2020 12:23:29.370142,CASE2~2~18-nov-2020 12:17:12.815714~700000~18-nov-2020 12:23:31.400500]" | eval data=split(data,"!") | mvexpand data | eval _raw=data
    ```The above just sets up test data```
    | rex "\[Message=(?<Message>[^\]]+)"
    | rex field=Message "(?<field>\w+)\s+=\s+(?<value>.*)"
    | eval {field}=value
    | table Hour Id StartTime EndTime beginThread endThread zipBefore zipAfter
    | filldown | tail 1
    

    The second part is similar to the first, but the parsing is different.

    | makeresults | eval data="[Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=Hour = 18-nov-2020 11:00:00]!
    [Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=Id = 126566]!
    [Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=zipBefore = 18-nov-2020 12:27:08.776174]!
    [Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=zipAfter = 18-nov-2020 12:36:52.718122]!
    [Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=StartTime = 18-nov-2020 12:17:10.603227]!
    [Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=EndTime = 18-nov-2020 12:36:53.094513]!
    [Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=beginThread = 18-nov-2020 12:17:10.905782]!
    [Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=endThread = 18-nov-2020 12:24:22.628907]!
    [Date=2020-11-18] [Time=12:36:53] [Mode=DEBUG] [Class=PrintUtil] [Line=557] [Message=LogTime = CASE1~6~18-nov-2020 12:17:11.377070~0~18-nov-2020 12:17:12.608526,CASE1~0~18-nov-2020 12:17:11.365409~0~18-nov-2020 12:17:12.654285,CASE3~0~18-nov-2020 12:17:12.644921~11~18-nov-2020 12:17:13.636655,CASE2~5~18-nov-2020 12:17:13.295225~700000~18-nov-2020 12:23:29.370142,CASE2~2~18-nov-2020 12:17:12.815714~700000~18-nov-2020 12:23:31.400500]" | eval data=split(data,"!") | mvexpand data | eval _raw=data
    ```The above just sets up test data```
    | rex "\[Message=(?<Message>[^\]]+)"
    | rex field=Message "(?<field>\w+)\s+=\s+(?<value>.*)"
    | eval {field}=value
    ```We only care about LogTime messages```
    | search LogTime=*
    ```Divide the message on commas and make separate events```
    | eval LogTime=split(LogTime, ",") | mvexpand LogTime
    ```Parse the events```
    | rex field=LogTime "(?<Case>[^~]+)~(?<Thread>[^~]+)~(?<StartTime>[^~]+)~(?<Count>[^~]+)~(?<EndTime>[^~]+)(?:,|$)"
    | table Case Thread StartTime Count EndTime